SignalIQ · Methodology & Data Sources

How SignalIQ works.
What it measures. What it doesn’t.

SignalIQ is a signal-vs-coverage gap detector, not a prediction engine. It tells you where primary-source activity is surging ahead of press coverage — not whether a story will break. Every score links back to verifiable, open data.

§ 01 · What the scores mean

A score of 85/100 does not mean there is an 85% chance this story breaks in the press. It means this topic has a high signal-to-coverage gap right now: a lot of activity in primary sources (filings, papers, forum discussion) relative to how much the press has covered it. That gap is your opportunity window — not a guarantee.

Stories with high scores can go nowhere. Stories with low scores can explode overnight. SignalIQ gives you a data-backed starting point for your pitch research — the judgment call of whether and how to pitch is still yours.

The badge shows the lead/whitespace score. Think of it as “how far ahead of the coverage are you?” rather than “how likely is this to get covered?”

§ 02 · Data sources

SignalIQ pulls from five open, primary-source databases. No paywalled data. No stale training data. Every signal links back to its original source — you can click through and verify anything we surface.

SEC EDGAR · Federal Filings · Signal · Credibility: 95%

The SEC's full-text search index (EFTS). SignalIQ counts how many filings mention a keyword in the past 30 days and compares that to the prior-month baseline. A surge in disclosures (10-Ks, 8-Ks, and the earnings releases attached to them) often precedes mainstream press coverage by days to weeks. This is federal-grade, primary-source data: public companies are legally required to disclose material information, so these signals carry real weight.

Commercial use: Public domain — US government data.
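The 30-day comparison described above can be sketched as a simple ratio. This is a minimal illustration, not SignalIQ's actual formula: the function name `surge_ratio` and the `smoothing` constant are assumptions added so that a baseline month with zero filings does not cause a division by zero.

```python
def surge_ratio(current_count: int, baseline_count: int, smoothing: float = 1.0) -> float:
    """Ratio of the last-30-day filing count to the prior-month baseline.

    `smoothing` is an assumed additive constant (not from SignalIQ) that keeps
    the ratio finite when the baseline month had no matching filings.
    """
    if current_count < 0 or baseline_count < 0:
        raise ValueError("counts must be non-negative")
    return (current_count + smoothing) / (baseline_count + smoothing)


# 10 filings this month against 1 last month is a clear surge;
# a topic steady at 50 filings is not.
print(surge_ratio(10, 1))
print(surge_ratio(50, 50))
```

A ratio near 1.0 means filing activity is flat; the larger the ratio, the sharper the surge relative to the baseline.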

GDELT DOC 2.0 · Global News Monitor · Coverage denominator · Credibility: 80%

GDELT monitors news published online around the world. SignalIQ uses it as the coverage denominator: for example, a topic that appears in 0.3% of all global news has moderate coverage. The Coverage Gap score is the difference between signal volume (primary sources) and coverage volume (news). A wide gap suggests a story may be emerging that journalists haven't caught up to yet; it does not guarantee one.

Commercial use: Open, free global news-data project. Attribution appreciated.
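As a rough sketch of the "difference between signal volume and coverage volume" idea, assume both quantities have already been normalised to a 0 to 1 scale. How SignalIQ actually normalises them is not stated here, so the function name `coverage_gap`, its inputs, and the clamping at zero are all illustrative assumptions.

```python
def coverage_gap(signal_level: float, coverage_level: float) -> float:
    """Gap between primary-source signal and press coverage.

    Both inputs are assumed to be pre-normalised to the range 0 to 1.
    """
    for name, value in (("signal_level", signal_level), ("coverage_level", coverage_level)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Clamp at zero: when coverage already matches or outpaces the signal,
    # there is no gap left to exploit.
    return max(0.0, signal_level - coverage_level)


print(coverage_gap(0.75, 0.25))  # strong signal, thin coverage
print(coverage_gap(0.25, 0.75))  # press is already ahead of the signal
```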

arXiv · Academic Preprints · Signal · Credibility: 80%

arXiv hosts academic preprints — research papers published before peer review. Academic preprint volume is a leading indicator: research attention typically precedes mainstream press coverage by weeks to months. SignalIQ counts paper submissions matching a keyword in the past 30 days. We use only metadata (title, date, link) — not full paper text.

Commercial use: Metadata freely reusable under arXiv's API terms.

Wikipedia · Edit-Surge Detector · Signal · Credibility: 65%

Wikipedia's pageview API returns daily view counts for any article. A spike in views — especially across a cluster of related articles — reveals when a topic is being actively researched en masse. This often precedes journalist interest by one to three weeks: journalists research before they write. We use view counts only, not article text.

Commercial use: View counts are open data, freely usable.
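To make the view-spike idea concrete, here is a hedged sketch in two parts. `pageviews_url` builds a request URL in the shape used by the public Wikimedia per-article pageviews endpoint; `is_view_spike` compares the most recent days with the earlier days in the window. The 7-day window and the 2x threshold are assumptions for illustration, not SignalIQ's actual settings.

```python
from statistics import mean
from urllib.parse import quote

PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article: str, start: str, end: str, project: str = "en.wikipedia") -> str:
    """Build a daily per-article pageviews URL; start and end are YYYYMMDD strings."""
    title = quote(article.replace(" ", "_"), safe="")
    return f"{PAGEVIEWS_BASE}/{project}/all-access/user/{title}/daily/{start}/{end}"


def is_view_spike(daily_views: list[int], recent_days: int = 7, threshold: float = 2.0) -> bool:
    """True when the recent daily average is at least `threshold` times the earlier average.

    `recent_days` and `threshold` are assumed values chosen for this example.
    """
    if len(daily_views) <= recent_days:
        raise ValueError("need more history than the recent window")
    baseline = mean(daily_views[:-recent_days])
    recent = mean(daily_views[-recent_days:])
    return baseline > 0 and recent >= threshold * baseline


print(pageviews_url("GLP-1 receptor agonist", "20240101", "20240130"))
print(is_view_spike([100] * 23 + [250] * 7))
```

Running the same check across a cluster of related articles, and counting how many spike together, would approximate the "cluster" signal described above.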

Hacker News · Tech Forum Velocity · Signal · Credibility: 55%

Hacker News's Algolia API surfaces stories and comments. SignalIQ measures points and comment velocity for the top matching stories in the past 30 days. HN skews toward tech, SaaS, and AI stories — it's a leading indicator for those beats but a lagging one for health, climate, or fintech stories where the community is smaller.

Commercial use: Free, public API.
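One plausible reading of "points and comment velocity" is total engagement per day over the window. The sketch below assumes each story is a dict with `points` and `num_comments` fields, as returned in Hacker News search results; the function name `engagement_velocity` and the per-day definition are assumptions, not SignalIQ's documented method.

```python
def engagement_velocity(hits: list[dict], window_days: int = 30) -> float:
    """Average points plus comments per day across the supplied stories."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    # Missing or null fields are treated as zero engagement.
    total = sum((hit.get("points") or 0) + (hit.get("num_comments") or 0) for hit in hits)
    return total / window_days


stories = [
    {"title": "Example story A", "points": 120, "num_comments": 45},
    {"title": "Example story B", "points": 30, "num_comments": None},
]
print(engagement_velocity(stories))
```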

§ 03 · How the score is calculated

The opportunity score is a weighted composite across five components, with a corroboration bonus added on top. Coverage gap carries the most weight — the whole premise of SignalIQ is that the gap between signal and press coverage is the opportunity.

Coverage gap · 30% · How thin is press coverage relative to signal volume? The bigger the gap, the bigger the opportunity window. This is the heaviest component.
Signal magnitude · 25% · Raw volume of signals: filing counts, paper counts, view counts. More signal = more real activity.
Signal velocity · 22% · How fast is signal volume growing? A topic with 10 filings this month vs. 1 last month scores higher than one steady at 50.
Beat fit · 13% · How closely does this topic match the selected beat? Prevents off-topic results from surfacing high.
Source credibility · 10% · Weighted average credibility of the sources that returned data. An SEC-only signal scores higher than a Hacker News–only signal.
Corroboration bonus · +15% max · A bonus added when multiple independent sources confirm the same topic. One source is a hint. Three is a story.

BAND THRESHOLDS: Hot lead ≥ 80 · Worth a look 60–79 · Early 40–59 · Noise / late < 40
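The weights and band thresholds above can be put together in a short sketch. The weights and bands come from this page; everything else is an assumption made for illustration: each component is taken to be pre-normalised to 0 to 1, the corroboration bonus is read as up to 15 points on the 100-point scale, and it is assumed to grow by 5 points for each confirming source beyond the first.

```python
WEIGHTS = {
    "coverage_gap": 0.30,
    "signal_magnitude": 0.25,
    "signal_velocity": 0.22,
    "beat_fit": 0.13,
    "source_credibility": 0.10,
}
MAX_CORROBORATION_BONUS = 15.0
BONUS_PER_EXTRA_SOURCE = 5.0  # assumed step size, not stated on this page


def opportunity_score(components: dict[str, float], corroborating_sources: int) -> float:
    """Weighted composite on a 0-100 scale plus a capped corroboration bonus."""
    base = 100 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    extra_sources = max(0, corroborating_sources - 1)
    bonus = min(MAX_CORROBORATION_BONUS, BONUS_PER_EXTRA_SOURCE * extra_sources)
    # Cap at 100 and round to one decimal to avoid floating-point noise.
    return round(min(100.0, base + bonus), 1)


def band(score: float) -> str:
    """Map a score to the band thresholds listed above."""
    if score >= 80:
        return "Hot lead"
    if score >= 60:
        return "Worth a look"
    if score >= 40:
        return "Early"
    return "Noise / late"


example = {name: 0.5 for name in WEIGHTS}
score = opportunity_score(example, corroborating_sources=3)
print(score, band(score))
```

In this example every component sits at its midpoint, giving a base of 50; two extra confirming sources add 10 points under the assumed bonus rule, which lifts the topic from "Early" into "Worth a look".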

§ 04 · Beta transparency — what we hardcoded and why

SignalIQ is in beta. Two deliberate limitations are worth knowing about.

20 seeds per beat

Each beat scans 20 pre-written search phrases (seeds) across all five data sources. For example, the Health & Wellness beat searches for “GLP-1 drugs”, “chronic disease management”, “clinical AI”, and 17 others. The results you see are drawn from these 20 topics only.

We chose 20 because it balances coverage with cost and speed. Fewer seeds meant too many gaps (the original v1 had only 5). More seeds means slower scans and harder-to-audit results. The seeds are hand-curated by journalists and PR practitioners who cover each beat.

This is intentional for the beta. In a future version, pro users will be able to add custom seeds for their specific niche — but we wanted to ship a reliable, auditable tool first.

A fixed set of beat categories

SignalIQ currently covers the following beats: SaaS & Startups, Fintech, Health & Wellness, Climate & Energy, AI, Cybersecurity & Privacy. These were chosen because they represent the highest-volume PR beats for the startup and scale-up companies most likely to use this tool.

The “right” beat for your company is not always your industry — it’s the vertical your target journalists cover. A health-AI company should usually choose Health & Wellness, not SaaS, unless the story is about the startup itself (a funding round, a product launch to tech press).

More beats (Legal & Policy, Consumer, Media & Publishing, etc.) are planned for future versions. If you need a beat that isn’t here, let us know.

Your startup context — what it does and doesn’t do

Adding your startup context re-ranks the scan results by keyword relevance to your description, and personalises the pitch angle in your asset pack. It does not change what topics are scanned.

We deliberately kept the scan beat-wide. A tool that only surfaces topics directly matching your company description would miss adjacent opportunities — often the most interesting ones. The market radar should be broader than your current pitch list.
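The re-ranking behaviour described above (same topics, different order) can be sketched with simple word overlap. This is an illustrative stand-in: the `relevance` measure and function names are assumptions, and SignalIQ's actual keyword-relevance method is not specified on this page.

```python
import re


def relevance(topic: str, context: str) -> float:
    """Fraction of the topic's words that also appear in the startup context."""
    topic_words = set(re.findall(r"[a-z0-9]+", topic.lower()))
    context_words = set(re.findall(r"[a-z0-9]+", context.lower()))
    if not topic_words:
        return 0.0
    return len(topic_words & context_words) / len(topic_words)


def rerank(topics: list[str], context: str) -> list[str]:
    """Reorder scanned topics by relevance without adding or removing any.

    `sorted` is stable, so topics with equal relevance keep their scan order.
    """
    return sorted(topics, key=lambda topic: relevance(topic, context), reverse=True)


scanned = ["chronic disease management", "clinical AI", "GLP-1 drugs"]
print(rerank(scanned, "We build clinical AI tools for hospitals"))
```

Note that the output always contains exactly the topics that went in: the context changes the ordering, never the scan itself.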

§ 05 · How we use your data

SignalIQ collects your email address (if you choose to unlock more scans) and the startup context you optionally provide. We use your email to send SIA’s earned-media newsletter and to manage your scan quota. We do not sell your data or share it with third parties. Your startup context is used only to generate your asset pack and is not stored beyond the current session.

Unsubscribe from the newsletter at any time via the link in any email. For the full policy, see our Privacy Policy.
