Stratified Random Sampling
Definition
Stratified random sampling is a statistical technique in which a population is divided into non-overlapping subgroups (called strata) and a random sample is drawn independently from each stratum, most commonly in proportion to the stratum's size. This method ensures that every key subgroup within a population is represented in the final sample, and it typically reduces sampling error compared with simple random sampling.
What is Stratified Random Sampling?
Stratified random sampling divides a heterogeneous population into homogeneous layers based on shared characteristics—such as age, income, gender, geography, or industry sector—then randomly selects samples from within each layer. The key principle is that elements within each stratum are similar to one another, while elements across strata differ meaningfully. By sampling from each stratum separately, researchers ensure that minority or important subgroups are not accidentally under-represented.
There are two allocation approaches: proportionate (where sample size from each stratum matches its proportion in the population) and disproportionate (where certain strata are deliberately over-sampled to enable detailed analysis). The non-overlapping nature of strata is critical; if boundaries blur, some individuals gain higher selection probability than others, compromising the sample's validity. Stratified random sampling is widely used in market research, quality audits, customer satisfaction studies, and regulatory compliance surveys—anywhere a population contains distinct, meaningful segments that warrant separate attention.
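The practical difference between the two allocation approaches can be sketched in a few lines of Python. This is only an illustration: the stratum names, sizes, sample counts, and default rates below are hypothetical values, not data from any real portfolio. The sketch shows that under disproportionate allocation each sampled unit stands for a different number of population units (its design weight, N_h / n_h), so stratum results should be weighted by population share before being combined.

```python
# Hypothetical strata: population size N, sample size n, and the default
# rate observed in each stratum's sample (all values are illustrative).
strata = {
    "retail":       {"N": 60_000, "n": 400, "sample_default_rate": 0.02},
    "agricultural": {"N": 18_000, "n": 400, "sample_default_rate": 0.08},
    "msme":         {"N": 22_000, "n": 200, "sample_default_rate": 0.05},
}

def design_weights(strata):
    """Each sampled unit in stratum h represents N_h / n_h population units."""
    return {name: s["N"] / s["n"] for name, s in strata.items()}

def weighted_rate(strata):
    """Population-level rate: stratum rates weighted by population share N_h / N."""
    total = sum(s["N"] for s in strata.values())
    return sum(s["N"] / total * s["sample_default_rate"] for s in strata.values())

def unweighted_rate(strata):
    """Naive pooled rate that ignores the unequal sampling fractions."""
    n_total = sum(s["n"] for s in strata.values())
    return sum(s["n"] * s["sample_default_rate"] for s in strata.values()) / n_total

weights = design_weights(strata)
print(weights)
print(round(weighted_rate(strata), 4), round(unweighted_rate(strata), 4))
```

In this made-up case the agricultural stratum is heavily over-sampled, so the naive pooled rate overstates the population-level rate; the weighted figure corrects for that.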
How Stratified Random Sampling Works
The process unfolds in five key steps:
Identify the population and define strata: Determine which characteristic(s) will divide the population. For a bank studying loan defaults, strata might be loan type (retail, agricultural, MSME), region (metro, urban, rural), or credit score band (excellent, good, fair, poor).
Classify every unit into a stratum: Assign each member of the population to exactly one stratum. No overlap is permitted; a borrower cannot belong to two strata simultaneously.
Determine stratum sizes: Calculate each stratum's share of the total population. If a bank has 100,000 borrowers, made up of 60,000 retail customers (60%), 18,000 agricultural borrowers (18%), and 22,000 MSME borrowers (22%), these proportions are noted.
Decide allocation method: In proportionate allocation, a sample of 1,000 would comprise 600 retail, 180 agricultural, and 220 MSME borrowers. In disproportionate allocation, you might oversample agricultural borrowers (say 400) to study them in depth despite their smaller share of the population; results must then be weighted by stratum size before they are combined into a population-level estimate.
Randomly select samples within each stratum: Use a random number generator or lottery system to pick individuals from each stratum independently. This ensures no selection bias within strata.
Under proportionate allocation, the result is a sample that mirrors the population's composition. When strata are internally homogeneous, it also yields lower sampling error and narrower confidence intervals than a simple random sample of the same size.
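The five steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a production sampling tool: the function names, the borrower IDs, and the rule that maps an ID to a stratum are all hypothetical. Rounding uses the largest-remainder method so that the stratum counts always add up to the requested sample size; that rounding rule is an assumption of this sketch, not something the steps prescribe.

```python
import random
from collections import defaultdict

def proportionate_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to size (largest-remainder rounding)."""
    total = sum(stratum_sizes.values())
    exact = {h: sample_size * n / total for h, n in stratum_sizes.items()}
    alloc = {h: int(x) for h, x in exact.items()}
    leftover = sample_size - sum(alloc.values())
    # Hand any remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

def stratified_sample(units, stratum_of, sample_size, seed=None):
    """units: iterable of IDs; stratum_of: function mapping an ID to one stratum label."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for u in units:
        groups[stratum_of(u)].append(u)  # each unit lands in exactly one stratum
    sizes = {h: len(g) for h, g in groups.items()}
    alloc = proportionate_allocation(sizes, sample_size)
    # Independent simple random sample, without replacement, inside each stratum.
    return {h: rng.sample(groups[h], alloc[h]) for h in groups}

# Hypothetical population of 1,000 borrower IDs: 60% retail, 18% agricultural, 22% MSME.
def label(borrower_id):
    if borrower_id < 600:
        return "retail"
    if borrower_id < 780:
        return "agricultural"
    return "msme"

sample = stratified_sample(range(1000), label, sample_size=100, seed=42)
print({h: len(ids) for h, ids in sample.items()})
```

Because `stratum_of` returns a single label per unit, the grouping step enforces the rule that no unit can belong to two strata.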
Stratified Random Sampling in Indian Banking
Stratified random sampling is integral to Indian banking and financial regulation. The Reserve Bank of India (RBI) employs this technique in its regulatory stress tests and on-site inspections of banks. When conducting concurrent audits, bank internal audit teams stratify loan portfolios by product type (term loan, cash credit, overdraft), borrower category (corporate, MSME, retail, priority sector), and loan size to ensure representative samples are tested. This approach helps detect fraud, compliance breaches, and credit quality issues across all portfolio segments.
The RBI's guidelines on priority sector lending compliance require banks to verify their advances across sectors using stratified sampling to avoid under-reporting. Similarly, when banks conduct Know Your Customer (KYC) audits or fraud investigations, they stratify customer accounts by transaction volume, geography, and tenure to ensure suspicious patterns are captured across all segments, not just high-value accounts.
In loan classification under Prudential Norms, banks stratify advances by asset class (standard, sub-standard, doubtful, and loss) or, where expected-credit-loss staging applies, by stage of deterioration (Stage 1, Stage 2, Stage 3 under IFRS 9), and conduct stratified audits to validate correct classification. The National Payments Corporation of India (NPCI) uses stratified sampling when auditing payment system transactions by value bracket and institution type. This technique appears in CAIIB syllabi under statistics and audit methodology modules, particularly when studying sampling design for regulatory examinations and credit assessments.
Practical Example
Haryana Rural Cooperative Bank wants to audit its retail loan portfolio of 50,000 borrowers for documentation completeness and disbursement accuracy. Management decides to sample 500 borrowers (1% of the base). They stratify the portfolio into four groups: salaried employees (25,000), self-employed traders (15,000), agricultural borrowers (8,000), and pensioners (2,000).
Using proportionate allocation, they randomly select 250 salaried borrowers, 150 traders, 80 agricultural borrowers, and 20 pensioners (each stratum's share of the portfolio multiplied by 500). The audit team then reviews documentation, verifies loan purpose, and checks fund flow within each stratum. This ensures that weaker governance in the small pensioner segment (2,000 people) is not masked by the larger salaried group. Under simple random sampling, the number of pensioners drawn would be left to chance: about 20 on average, but possibly too few in a given draw to support any conclusion about that segment.
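The allocation for this portfolio can be checked with a few lines of Python: each stratum's share of the 500-borrower sample equals its share of the 50,000-borrower portfolio. The second part is an added illustration, not part of the bank's procedure: it uses the hypergeometric distribution to estimate how likely an unstratified draw of 500 would be to contain fewer than 10 pensioners. The threshold of 10 and the helper name `prob_fewer_than` are arbitrary choices made for this sketch.

```python
from math import comb

portfolio = {
    "salaried": 25_000,
    "self_employed_traders": 15_000,
    "agricultural": 8_000,
    "pensioners": 2_000,
}
sample_size = 500
total = sum(portfolio.values())  # 50,000 borrowers

# Proportionate share of the sample for each stratum (exact integer arithmetic).
allocation = {h: sample_size * n // total for h, n in portfolio.items()}
print(allocation)
# {'salaried': 250, 'self_employed_traders': 150, 'agricultural': 80, 'pensioners': 20}

def prob_fewer_than(k, group, population, draws):
    """Hypergeometric P(X < k): chance that a simple random sample of `draws`
    units contains fewer than k members of a group of size `group`."""
    denom = comb(population, draws)
    numer = sum(comb(group, x) * comb(population - group, draws - x) for x in range(k))
    return numer / denom

p = prob_fewer_than(10, portfolio["pensioners"], total, sample_size)
print(f"P(fewer than 10 pensioners under simple random sampling) = {p:.4f}")
```

Stratification removes this uncertainty entirely: the pensioner count is fixed by design rather than left to the draw.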
Stratified Random Sampling vs. Simple Random Sampling
| Aspect | Stratified Random Sampling | Simple Random Sampling |
|---|---|---|
| Population division | Divided into homogeneous strata | No division; entire population treated as one |
| Representation | Guarantees all subgroups are represented | No guarantee of subgroup representation |
| Accuracy | Higher precision and lower sampling error when strata are internally homogeneous | Lower precision for the same sample size when the population is heterogeneous |
| Complexity | More complex; requires stratum definition | Simple to execute; minimal planning |
Stratified random sampling is the better choice when the population contains meaningful, distinct segments and you need reliable estimates for each segment. Simple random sampling works well for homogeneous populations where no subgroup is critical to isolate. In banking, stratified sampling is generally preferred for audit, compliance, and risk analysis because loan portfolios typically have heterogeneous risk profiles across product, sector, and borrower type.
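The precision claim in the comparison can be illustrated with a small simulation. The sketch below builds a hypothetical population of three strata whose average values differ sharply, then repeatedly estimates the overall mean with a simple random sample and with a proportionate stratified sample of the same size. All numbers (stratum sizes, means, spread, number of repetitions) are assumptions chosen to make the effect visible; the gap shrinks when strata differ less.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: three strata with clearly different average values.
population = {
    "A": [rng.gauss(100, 10) for _ in range(6000)],
    "B": [rng.gauss(200, 10) for _ in range(3000)],
    "C": [rng.gauss(400, 10) for _ in range(1000)],
}
everyone = [v for values in population.values() for v in values]
N = len(everyone)
n = 100  # total sample size per draw

def srs_mean():
    """Mean of one simple random sample drawn from the whole population."""
    return statistics.fmean(rng.sample(everyone, n))

def stratified_mean():
    """Proportionate allocation (60 / 30 / 10 units), combined with weights N_h / N."""
    estimate = 0.0
    for values in population.values():
        n_h = n * len(values) // N
        estimate += len(values) / N * statistics.fmean(rng.sample(values, n_h))
    return estimate

reps = 2000
var_srs = statistics.variance([srs_mean() for _ in range(reps)])
var_strat = statistics.variance([stratified_mean() for _ in range(reps)])
print(f"variance of SRS mean:        {var_srs:.2f}")
print(f"variance of stratified mean: {var_strat:.2f}")
```

The stratified estimator varies far less from draw to draw here because stratification removes the between-stratum component of the variance, leaving only the within-stratum spread.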
Key Takeaways
- Stratified random sampling divides a population into non-overlapping subgroups (strata) and randomly samples from each stratum to ensure fair representation.
- Strata must be mutually exclusive; no individual can belong to more than one stratum, or selection probability becomes unequal.
- Proportionate allocation matches sample size to stratum size; disproportionate allocation intentionally over-samples smaller but important segments.
- This technique reduces sampling error and improves statistical precision compared to simple random sampling, making it preferred in banking audits and RBI compliance studies.
- Indian banks use stratified sampling for priority sector lending audits, loan classification validation, and concurrent audit programs across product and borrower categories.
- The method is particularly effective when analyzing subgroup differences or ensuring minority segments (e.g., agricultural borrowers, retail borrowers in a corporate bank) are not under-represented.
- Stratified random sampling is part of the CAIIB curriculum for audit methodology and is essential for designing robust credit risk sampling frameworks.
Frequently Asked Questions
Q: What is the difference between stratified random sampling and stratified non-random sampling? A: Stratified random sampling uses randomness to select units within each stratum, ensuring every member has a known, non-zero chance of selection. Stratified non-random sampling (also called stratified purposive sampling) uses judgment to pick specific units within strata, which introduces selection bias. In Indian banking audits, stratified random sampling is preferred to maintain compliance integrity.
Q: Can I use stratified random sampling if I don't know the stratum sizes beforehand? A: Not reliably. Allocation and weighting both depend on knowing the population size and each stratum's share before sampling begins. If stratum sizes are unknown, alternatives such as cluster sampling or quota sampling are typically considered instead. In practice, audit teams rely on borrower databases that already carry classification data, which is what makes proper stratification possible.
Q: How do I decide which characteristic to use for stratification? A: Choose characteristics that are strongly related to the outcome you are measuring. In loan quality audits, stratify by loan product, geography, and borrower type because these predict default risk. In customer satisfaction surveys, stratify by customer segment and tenure. The goal is to make within-stratum variation small and between-stratum variation large.
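One rough way to compare candidate stratification variables is to measure what fraction of the outcome's total variance lies between strata rather than within them. The Python sketch below does this for a handful of hypothetical loan records; the field names (`product`, `region`, `days_past_due`) and all values are invented for illustration, and the helper `between_share` is not from any library.

```python
from collections import defaultdict
from statistics import fmean

def between_share(records, stratifier, outcome):
    """Fraction of total outcome variance that lies between strata (0 to 1).
    A higher value means the stratifier separates the outcome better."""
    groups = defaultdict(list)
    for r in records:
        groups[r[stratifier]].append(r[outcome])
    all_values = [r[outcome] for r in records]
    grand = fmean(all_values)
    total_ss = sum((v - grand) ** 2 for v in all_values)
    between_ss = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups.values())
    return between_ss / total_ss if total_ss else 0.0

# Hypothetical loan records with two candidate stratifiers.
loans = [
    {"product": "retail", "region": "metro", "days_past_due": 2},
    {"product": "retail", "region": "rural", "days_past_due": 4},
    {"product": "retail", "region": "metro", "days_past_due": 3},
    {"product": "msme",   "region": "rural", "days_past_due": 30},
    {"product": "msme",   "region": "metro", "days_past_due": 28},
    {"product": "msme",   "region": "rural", "days_past_due": 33},
]
shares = {c: between_share(loans, c, "days_past_due") for c in ("product", "region")}
for candidate, share in shares.items():
    print(candidate, round(share, 3))
```

In this toy data, product explains almost all of the variation in days past due while region explains little, so product would be the more useful stratifier.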