Sub-Poll Series: The proof in the pudding
I’ve been working on the Sub-Poll Series for some time now, mainly as an academic exercise in my retirement, but it’s also very useful for political betting. With full Scottish polls few and far between outside of elections, the Sub-Poll Series offers a weekly insight into Scottish political opinion, and can be especially good at identifying early swings in pro-nationalist voting.
I still get a lot of skeptics, especially in the media, who have been trained (quite rightly) to dismiss sub-polls. But a sub-poll series is not the same thing as a sub-poll. The very reasons one would reject an individual sub-poll arguably do not apply to the sub-poll series; I’ve set most of this out here.
I then realized that there was a relatively good way of testing the hypothesis when I picked up YouGov’s nearly comprehensive database of weekly polling from 2020 through to the 2024 election (here). That gave me a house-specific Sub-Poll Series, over a longer period of time, that I could compare against an election cycle’s worth of full Scottish polling (here).
I compared the YouGov Sub-Poll Series to full YouGov polls (to compare house effect to house effect), and also to an aggregate of all pollsters.
This note sets out my methodology for the comparison, and the results. Some of it was written by AI, which is why it’s very detailed, but I suspect few people will object to that.
TL;DR: statistically speaking, the Sub-Poll Series is just as valid as a full Scottish-only poll. Ignore the data at your peril.
Data Sources
- Sub-Poll Data: YouGov weekly sub-polls from January 2020 to April 2026 (293 data points)
- FULL Poll Data: 125 polls from various pollsters (Savanta, Ipsos, YouGov, Survation, Redfield & Wilton, etc.) from March 2020 to July 2024
Data Preparation Steps
- Transposing the sub-poll data from wide to long format
- Adding Year and Week Number columns
- Sorting by Year and Week Number
- Interpolating missing weeks (44 weeks filled using surrounding values)
- Standardizing FULL poll dates (converting date ranges to midpoints)
- Removing political event annotations (12 events removed)
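As a rough illustration, the preparation steps above might look something like this in Python. This is only a sketch, not the actual pipeline: the function names and the input layout are made up for the example, and linear interpolation is an assumption about what "using surrounding values" means.

```python
from datetime import date, timedelta


def wide_to_long(wide):
    """Turn {party: {week_date: share}} into sorted rows, one per (week, party).

    The input layout is an assumption made for this sketch.
    """
    rows = []
    for party, series in wide.items():
        for week_date, share in series.items():
            year, week, _ = week_date.isocalendar()  # ISO year and week number
            rows.append({"year": year, "week": week, "party": party, "share": share})
    rows.sort(key=lambda r: (r["year"], r["week"], r["party"]))
    return rows


def fill_missing_weeks(series):
    """Fill absent weeks by linear interpolation between the surrounding values.

    Assumes the dates in `series` are whole weeks apart.
    """
    if not series:
        return {}
    dates = sorted(series)
    filled = {}
    for left, right in zip(dates, dates[1:]):
        filled[left] = series[left]
        gap = (right - left).days // 7  # weeks between two observed points
        for k in range(1, gap):
            frac = k / gap
            filled[left + timedelta(weeks=k)] = (
                series[left] + frac * (series[right] - series[left])
            )
    filled[dates[-1]] = series[dates[-1]]
    return filled


def midpoint(start, end):
    """Midpoint of a fieldwork date range, rounded down to a whole day."""
    return start + timedelta(days=(end - start).days // 2)
```

For example, a series with readings of 40 and 46 three weeks apart would get 42 and 44 filled in for the two missing weeks.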
Comparison Methodology SPS
I used a 6-week trailing average for the Sub-Poll Series, as it smooths out noise while remaining responsive to genuine shifts in opinion.
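For anyone who wants to reproduce the smoothing, a trailing average can be sketched in a few lines of Python. The function name is hypothetical, and how the first five weeks are handled is an assumption: here they simply average over however many weeks are available.

```python
def trailing_average(values, window=6):
    """Average each point with up to `window - 1` preceding points.

    Early points use a shorter window (an assumption for this sketch).
    """
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Because the window only looks backwards, each smoothed value uses nothing later than the week it describes, which is what keeps it comparable with a full poll taken that week.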
Statistical Measures
Bias, MAE, Standard Deviation, Correlation (see below for a primer if you need one)
Results
- Sub-Polls vs ALL Pollsters: SNP 93.5% correlation, Con 78.4%, Lab 93.5%
- YouGov vs YouGov: SNP 95.6% correlation, Con 85.8%, Lab 91.0%
Conclusions
Sub-Polls vs All FULL Polls
The 6-week averaged sub-polls show remarkably strong agreement with FULL polls across multiple pollsters. SNP and Labour achieve >93% correlation, with an MAE of only about 2-3 percentage points. This validates the sub-poll methodology as a reliable tracking tool.
YouGov vs YouGov Comparison
When comparing like-for-like (YouGov sub-polls vs YouGov FULL polls), the fit improves further for most parties: SNP correlation rises to 95.6% and Conservative to 85.8%, although Labour dips slightly to 91.0%. The Green party correlation jumps dramatically from 43% to 78%, which suggests that different pollsters measure smaller parties quite differently.
Implications for Polling Methodology
The research supports using averaged sub-polls as a cost-effective alternative to expensive FULL polls. A mix of sub-poll providers compared against diverse FULL polls should provide even more robust results, which aligns with the current Sub-Poll Series.
Metric | Definition |
Bias (Average Difference) | The mean of all differences between FULL polls and Sub-poll averages (FULL poll minus Sub-poll average); a positive value means Sub-polls tend to underestimate, a negative value means they overestimate. |
Mean Absolute Error (MAE) | The average of the absolute differences (ignoring positive/negative signs), showing the typical size of the error regardless of direction. |
Standard Deviation (Std Dev) | Measures how spread out the differences are around the mean; a lower value indicates more consistent, predictable errors. |
Max Difference | The largest positive difference observed (where FULL poll was highest above the Sub-poll average). |
Min Difference | The largest negative difference observed (where FULL poll was furthest below the Sub-poll average). |
Correlation | A value from -1 to +1 measuring how closely the two data series move together; values near +1 indicate strong agreement in trends and direction. |
Count | The number of data points (poll comparisons) used to calculate each statistic. |
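The measures in this primer can be computed along the following lines. This is a sketch under assumptions rather than the exact calculation behind the results: the function name is hypothetical, the difference is taken as FULL poll minus Sub-poll average, the standard deviation is the sample version, and the correlation is Pearson's.

```python
from math import sqrt
from statistics import mean, stdev


def compare(full, sub):
    """Summarise paired FULL poll values against Sub-poll averages.

    `full` and `sub` are equal-length lists of vote shares for one party,
    already matched by date. Needs at least two pairs.
    """
    diffs = [f - s for f, s in zip(full, sub)]  # FULL minus Sub-poll
    mean_full, mean_sub = mean(full), mean(sub)
    # Pearson correlation built from the deviations around each mean.
    cov = sum((f - mean_full) * (s - mean_sub) for f, s in zip(full, sub))
    ss_full = sum((f - mean_full) ** 2 for f in full)
    ss_sub = sum((s - mean_sub) ** 2 for s in sub)
    return {
        "bias": mean(diffs),
        "mae": mean(abs(d) for d in diffs),
        "std_dev": stdev(diffs),  # sample standard deviation (assumption)
        "max_diff": max(diffs),
        "min_diff": min(diffs),
        "correlation": cov / sqrt(ss_full * ss_sub),
        "count": len(diffs),
    }
```

Note that the correlation is undefined if either series is completely flat, since the denominator would be zero; the sketch does not guard against that case.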

