Data Quality Tools

🔎 Interpretation & Use Cases

The Data Quality Tools are a JavaScript module designed to integrate with your application and make API calls to our device fingerprinting and fraud detection services. By analyzing a variety of signals, including device characteristics (e.g., browser configuration), bot detection, network indicators (e.g., Tor usage), and participant history, these tools provide valuable context for each survey response. This enables our customers to identify and filter out low-quality or fraudulent respondents.

Key Use Cases

  • Device Metrics: Analyze device information to detect suspicious behaviors such as bot activity, automation scripts, or location spoofing.
  • Location Verification: Use country and subdivision data to confirm geographic eligibility and flag potential anomalies.
  • Duplicate Detection: Leverage digital fingerprinting to identify and block multiple submissions from the same respondent.
  • Participant History: Track participant behavior across surveys to detect repeat offenders and maintain high-quality data over time.
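
For context, a minimal sketch of what an integration might look like is shown below. The import path, function name, and option names are hypothetical placeholders, not the actual API; refer to the integration guide for the real entry points.

// Hypothetical sketch only: the import path, function name, and options are
// illustrative placeholders, not the actual Data Quality Tools API.
import { initQualityTools } from 'data-quality-tools'; // placeholder package name

const result = await initQualityTools({
  surveyId: 'example.com/survey/123', // optional custom surveyId (see below)
});

// `result` is the JSON response documented on this page.
console.log(result.deviceScore, result.isDuplicate);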

📋 Example JSON Response

Below is an example of a JSON response from the Data Quality Tools.

{
  "participantId": "q6VCgbAXzpZCtZNCL0LF",
  "country": "US",
  "subdivision": "California",
  "isDuplicate": false,
  "surveyId": "example.com/survey/123",
  "deviceScore": 85,
  "averageDeviceScore": 78.5,
  "lowestDeviceScore": 50,
  "totalSurveys": 5,
  "deviceFailures": [
    "Bot Detection",
    "Incognito Mode",
    "Virtual Machine"
  ],
  "completionRate": 20,
  "duplicationRate": 15,
  "failureRate": 32,
  "qualificationRate": 22,
  "manualISQRate": 10,
  "automatedISQRate": 15,
  "osqRate": 20,
  "lastSurveyTaken": "2024-10-24T18:58:48.926Z",
  "brandFamiliarity": 10,
  "openEnd": 15,
  "speeding": 20,
  "honeyPot": 10,
  "straightlining": 15,
  "distinctSupplierCount": 3
}
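
As an illustration of how this response might drive screening decisions, the sketch below applies a few simple rules to the fields above; the rules and thresholds are arbitrary examples, not recommendations.

// Example screening logic over the documented response fields. The rules
// and thresholds are arbitrary illustrations; tune them to your own needs.
function shouldFlagRespondent(result) {
  if (result.isDuplicate) return true;                        // repeat submission
  if (result.deviceScore < 50) return true;                   // high-risk device
  if (result.deviceFailures.includes('Bot Detection')) return true;
  return false;
}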

📌 Response Attributes - General

participantId (String)

  • A unique identifier assigned to the participant's browser.
  • Used to track responses across multiple surveys while maintaining anonymity.

country (String)

  • The detected country of the participant based on their IP address.
  • Helps with fraud prevention by ensuring participants are accessing from expected locations.
  • If the country is not identified, it will be returned as 'ZZ'.

subdivision (String)

  • Represents the regional subdivision (e.g., state, province, or department) within the country.
  • Adds an additional layer of location validation.
  • If the subdivision is not identified, it will be returned as '-'.

lastSurveyTaken (String)

  • The ISO 8601 timestamp of the participant's most recent survey attempt.
  • Format: YYYY-MM-DDTHH:mm:ss.sssZ
  • Helps identify active participants and assess their recent engagement levels.
  • If no survey history exists, it will be returned as "No data" (see the parsing sketch below).
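
Because the field holds either an ISO 8601 string or the literal "No data", a defensive way to consume it might look like this sketch:

// lastSurveyTaken is either an ISO 8601 string or the literal "No data".
function daysSinceLastSurvey(result) {
  if (result.lastSurveyTaken === 'No data') return null; // no survey history
  const last = new Date(result.lastSurveyTaken);
  return (Date.now() - last.getTime()) / (1000 * 60 * 60 * 24);
}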

isDuplicate (Boolean)

  • Indicates whether this participant has submitted multiple responses using the same device.
    • true → Duplicate detected (the user has refreshed or retaken the survey multiple times; once detected, this flag remains true for all subsequent attempts).
    • false → Unique submission (no duplicate detected; the first request per participant per surveyId returns false).

surveyId (String)

  • Identifies the specific survey the participant was taking when the request was made.
  • Can be represented as a URL or a unique identifier assigned to a survey instance.
  • By default, the URL hostname + URL pathname is used, as illustrated below.
  • A custom string value can also be set as the surveyId in the toolbox integration.
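
For illustration, the default value corresponds to the following derivation (the module's internal logic may differ slightly):

// Default surveyId: URL hostname + URL pathname of the page making the request.
// e.g., https://example.com/survey/123?src=a yields "example.com/survey/123"
const defaultSurveyId = window.location.hostname + window.location.pathname;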

distinctSupplierCount (Number)

  • Represents the number of unique suppliers associated with the participant.
  • Helps identify participants who may be working with multiple survey providers.
  • A higher count may indicate a professional survey taker who participates across various platforms.
  • Useful for fraud detection and quality assessment of survey responses.

📌 Response Attributes - Device Score

  • These values are calculated in real time for each survey request. The device score can decrease in real time based on the device failures we detect.
  • The properties averageDeviceScore, lowestDeviceScore, and totalSurveys are calculated in real time across all Quality Tools requests from all clients.

deviceScore (Number)

  • Represents the trust level of the participant's device.
  • The score ranges from 0 to 100, categorized as follows:
    • 🟢 75 - 100: High Score (Trusted Device)
    • 🟡 50 - 75: Medium Score (Moderate Risk)
    • 🟠 25 - 50: Low Score (High Risk)
    • 🔴 0 - 25: Lowest Score (Extreme Risk)
  • The score is influenced by various device attributes, including the checks reported in deviceFailures (e.g., bot detection, incognito mode, virtual machine usage); a simple classifier over the documented ranges is sketched below.
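
The documented ranges translate directly into a simple classifier; a minimal sketch:

// Buckets the documented score ranges. Since adjacent ranges share their
// boundary values, ties are resolved here by checking from the top down.
function classifyDeviceScore(score) {
  if (score >= 75) return 'High Score (Trusted Device)';
  if (score >= 50) return 'Medium Score (Moderate Risk)';
  if (score >= 25) return 'Low Score (High Risk)';
  return 'Lowest Score (Extreme Risk)';
}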

averageDeviceScore (Number)

  • Represents the average device score across all requests made by the same participantId.
  • Includes every request across all surveys, on both your platform and any other client platform.

lowestDeviceScore (Number)

  • The lowest recorded device score for the participantId across all requests.
  • Helps assess participant risk levels by identifying historical low scores.
  • Includes every request across all surveys, on both your platform and any other client platform.

totalSurveys (Number)

  • Represents the total number of surveys taken by the participant.
  • This represents the total number of surveys across all Data Quality Tools clients' survey platforms.
  • A higher number may indicate a frequent participant, which could be either legitimate or a professional survey taker.

deviceFailures (Array of strings)

  • Lists the specific device checks the participant's device failed for this request (e.g., "Bot Detection", "Incognito Mode", "Virtual Machine", as in the example response above).
  • Each detected failure lowers the deviceScore in real time.

📌 Response Attributes - DQC Participant Survey History

As part of our ongoing improvements to fraud detection and participant evaluation, the Quality Tools have been enhanced to provide additional participant history performance metrics from the Data Mapping and Dispositions. These metrics, calculated from historical survey behavior, provide valuable insights into participant quality.

All metrics are calculated using the participant's full transaction history, matched either by their participantId or by their thirdPartyId associated with the thirdPartyIDProvider "DQC".


  • All values are percentages represented as whole numbers (e.g., 22.5 → 23).
  • These metrics are calculated once a day. If you upload your survey transactions to the DQC Dashboard, you may not see live changes to these properties.
  • If no data is found in the DQC dashboard, the "No data" default value is returned:
{
  "completionRate": "No data",
  "duplicationRate": "No data",
  "failureRate": "No data",
  "qualificationRate": "No data",
  "manualISQRate": "No data",
  "automatedISQRate": "No data",
  "osqRate": "No data"
}
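
Because these seven fields may hold either a number or the string "No data", numeric comparisons should be guarded; a minimal sketch:

// History metrics are numbers when data exists, or the string "No data".
// The threshold below is an arbitrary example, not a recommendation.
function hasHighFailureRate(result, threshold = 30) {
  if (typeof result.failureRate !== 'number') return false; // no history yet
  return result.failureRate > threshold;
}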

completionRate (Number)

  • Percentage of Qualified Complete or Flagged Complete over total attempts.

  • Calculated as:

    (Qualified Complete + Flagged Complete) / Total Attempts × 100
  • Indicates how often the participant completes a survey successfully.

  • Useful for identifying high-engagement respondents.


duplicationRate (Number)

  • Percentage of survey attempts flagged as duplicates.

  • Calculated as:

    Duplicate / Total Attempts × 100
  • Helps detect participants who repeatedly try to access or complete the same survey using the same device or identity.


failureRate (Number)

  • Percentage of survey attempts that failed key quality checks.

  • Calculated as:

    (OSQ + Manual ISQ + Automated ISQ) / Total Attempts × 100
  • Represents failed attention checks, inconsistent answers, or detected automation.


qualificationRate (Number)

  • Rate at which the participant qualifies for surveys, calculated over meaningful outcomes:

    (Qualified Complete + Flagged Complete) /
    (Qualified Complete + Flagged Complete + Duplicate + Did Not Qualify + Quota Full) × 100
  • A high rate suggests that the participant typically matches targeting criteria and avoids disqualification.


manualISQRate (Number)

  • Percentage of surveys where the participant failed manual In-Survey Quality (ISQ) checks.

  • Calculated as:

    Manual ISQ / Total Attempts × 100
  • Manual ISQ checks are quality control measures implemented by survey creators.

  • High percentages indicate poor attention or intentional deception.


automatedISQRate (Number)

  • Percentage of surveys where the participant failed automated In-Survey Quality (ISQ) checks.

  • Calculated as:

    Automated ISQ / Total Attempts × 100
  • Automated ISQ checks are system-generated quality validations.

  • Helps identify participants with consistent quality issues.


osqRate (Number)

  • Percentage of surveys where the participant failed Out-of-Survey Quality (OSQ) checks.

  • Calculated as:

    OSQ / Total Attempts × 100
  • OSQ checks occur before or after the main survey content.

  • High percentages suggest professional survey takers or fraudulent behavior.
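
Taken together, the formulas above can be reproduced from raw disposition counts. The sketch below mirrors them, with whole-number rounding as noted earlier; the shape of the counts object is illustrative, not an actual API type.

// Reproduces the documented rate formulas from raw disposition counts.
// The `counts` field names are illustrative; rates round to whole numbers.
function computeHistoryRates(counts) {
  const { qualifiedComplete, flaggedComplete, duplicate, didNotQualify,
          quotaFull, manualISQ, automatedISQ, osq, totalAttempts } = counts;
  const pct = (num, den) => Math.round((num / den) * 100);
  return {
    completionRate: pct(qualifiedComplete + flaggedComplete, totalAttempts),
    duplicationRate: pct(duplicate, totalAttempts),
    failureRate: pct(osq + manualISQ + automatedISQ, totalAttempts),
    qualificationRate: pct(
      qualifiedComplete + flaggedComplete,
      qualifiedComplete + flaggedComplete + duplicate + didNotQualify + quotaFull
    ),
    manualISQRate: pct(manualISQ, totalAttempts),
    automatedISQRate: pct(automatedISQ, totalAttempts),
    osqRate: pct(osq, totalAttempts),
  };
}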


📌 Response Attributes - DQC Quality Checks

These attributes provide detailed breakdowns of the specific Quality Check types that participants have encountered across their survey history.

  • All values are percentages represented as whole numbers (e.g., 22.5 → 23).
  • These metrics are calculated once a day. If you upload your survey transactions to the DQC Dashboard, you may not see live changes to these properties.
  • If no data is found in the DQC dashboard, the "No data" default value is returned.
{
  "brandFamiliarity": 0,
  "openEnd": 0,
  "speeding": 0,
  "honeyPot": 0,
  "straightlining": 0
}

brandFamiliarity (Number)

  • Percentage of surveys where the participant demonstrated brand familiarity or recognition.
  • Indicates how often the participant shows awareness of brands or products in surveys.
  • Useful for identifying participants who may have professional survey-taking behavior.

openEnd (Number)

  • Percentage of surveys where the participant provided meaningful open-ended responses.
  • Measures the quality and depth of qualitative feedback provided by the participant.
  • Low percentages may indicate automated responses or low-quality participation.

speeding (Number)

  • Percentage of surveys where the participant completed the survey too quickly.
  • Indicates rushed responses that may compromise data quality.
  • High percentages suggest professional survey takers or automated responses.

honeyPot (Number)

  • Percentage of surveys where the participant failed honey pot checks.
  • Honey pot checks are hidden fields designed to catch automated form submissions; a generic sketch of the technique follows below.
  • High percentages indicate bot activity or automated survey completion.
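
As a generic illustration of the technique (not this product's implementation), a honey pot can be a decoy field hidden from human users:

// Generic honey pot illustration (not this product's implementation).
// The decoy field is hidden from humans via CSS, so a non-empty value
// usually means an automated form filler populated it.
function failedHoneyPot(formData) {
  const decoy = formData.get('website'); // 'website' is a hypothetical decoy field
  return decoy !== null && decoy !== '';
}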

straightlining (Number)

  • Percentage of surveys where the participant engaged in straightlining behavior.
  • Straightlining occurs when participants select the same response option repeatedly; a toy check is sketched below.
  • Indicates low-quality responses or automated survey completion.
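
A toy version of such a check, assuming the answers to a grid question arrive as a simple array, might look like:

// Toy straightlining check: flags a grid response where every answer is
// identical. Real detection is more nuanced (e.g., near-straightlining).
function isStraightlined(answers) {
  return answers.length > 1 && answers.every((a) => a === answers[0]);
}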