Researchopedia
ISBN: 978-93-93166-28-9
For verification of this chapter, please visit http://www.socialresearchfoundation.com/books.php#8

Sampling

 Dr. Nidhi Shukla
Assistant Professor
Physiotherapy Department
Rama University
Kanpur, Uttar Pradesh, India

DOI:10.5281/zenodo.8395365
Chapter ID: 17998
This is an open-access book section/chapter distributed under the terms of the Creative Commons Attribution 4.0 International, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Sampling is simply the process of learning about the population on the basis of a sample drawn from it. Thus, in the sampling technique instead of every unit of the universe only a part of the universe is studied and the conclusions are drawn on that basis for the entire universe. A sample is a subset of population units.

The process of sampling involves three elements:

1. Selecting the sample,

2. Collecting the information, and

3. Making an inference about the population.

Sampling is the procedure by which some members of the population are selected as representative of the entire population.

Study Population: The study population is the population to which the results of the study will be inferred. The word population means the entire spectrum of a system of interest.

The study population depends upon the research question.

The sample needs to be representative of the population in terms of:

Time: seasonality, day of the week, time of the day

Place: urban, rural

Persons: age, sex

Other demographic details

Sampling unit: The elementary unit that will be sampled, for example:

People, health care workers, hospitals

Sampling frame: The list of all sampling units in the population.

Sampling scheme: The method used to select sampling units from the sampling frame.

Q: Why do we sample a population?

1. Obtain information from a large population

2. Ensure the efficiency of a study

3. Obtain more accurate information

Types of Sample:

1- Random (Probability) Sampling: Every unit in the population has a known probability of being selected. It is the only sampling method that allows valid conclusions to be drawn about the population. It removes the possibility of bias in the selection of subjects, ensures that each subject has a known probability of being chosen, and allows the application of statistical theory.

2- Non-Probability Sample: The probability of being selected is unknown.

(a)   Convenience sample-

i. Biased

ii. Best or worst scenario

(b)  Subjective Samples-

i. Based on knowledge

ii. Time/resources constraints

Methods of Sampling:

1. Simple Random

2. Systematic

3. Stratified

4. Cluster

5. Multistage

1- Simple Random Sampling: A random sample is one taken such that every item in the population defined in the research has an equal chance of being selected. Unrestricted random sampling is carried out with replacement, i.e. the item selected at each draw is 'returned' to the population before the next draw is made. Thus, any given unit can appear more than once in a sample. Simple random sampling is random sampling without replacement, and this is the form of random sampling most used in practice. All units are numbered and the sample is drawn at random, with an equal chance for each sampling unit.

1. Advantage: Simple; sampling error easily measured

2. Disadvantage: Needs a complete list of units; does not always achieve the best representation
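The difference between drawing with and without replacement can be illustrated with a minimal Python sketch. The frame of 100 numbered units, the function names, and the seed values are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units WITHOUT replacement: no unit can appear twice."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def unrestricted_random_sample(frame, n, seed=None):
    """Draw n units WITH replacement: a unit may appear more than once."""
    rng = random.Random(seed)
    return rng.choices(frame, k=n)

# Hypothetical sampling frame: units numbered 1 to 100.
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10, seed=1)
urs = unrestricted_random_sample(frame, 10, seed=1)
print("Without replacement:", srs)
print("With replacement:   ", urs)
```

Both functions need the complete list of units as input, which is the disadvantage noted above.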

2- Systematic Sampling: This method begins with the calculation of the sampling fraction to be used. Suppose the sample size is n and the sampling frame comprises N items. Thus, the sampling fraction is n/N and the sampling interval is k = N/n. A unit is drawn every k units. Systematic sampling gives a more even spread of the sample over the sampling frame than does random sampling.

1. Equal chance of being drawn

2. Calculate the sampling interval (k = N/n)

3. Draw a random number for the starting point

4. Select every kth unit from the first unit

Advantage: Ensures representativeness across the list; easy to implement.

Disadvantage: Dangerous if the list has cycles.
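The procedure (sampling interval k = N/n, random start, then every kth unit) can be sketched in Python as follows. This is a minimal illustration that assumes N is at least n and ignores any remainder when N is not an exact multiple of n; the function name and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units by taking every k-th unit after a random start."""
    N = len(frame)
    if n < 1 or n > N:
        raise ValueError("sample size must be between 1 and the frame size")
    k = N // n                        # sampling interval (remainder ignored)
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting position in 0..k-1
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 100 numbered units, sample of 10, so k = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

Because the selected units are always k positions apart, a list with a cycle of length k (or a multiple of it) would return units from the same point of the cycle every time.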

3- Stratified Sampling: With simple random sampling and for a given population and sample size, it is the variability of whatever characteristic is under investigation that determines the precision of any estimate made. The greater the variability, the poorer is the precision for a given sample size. Thus, the idea underlying stratification is that a researcher may be able to utilise prior knowledge about the level of what is being measured in the population. The population is classified into homogeneous strata.

1. Draw a sample in each stratum

2. Combine the results of all strata

Advantage: More precise if the variable is associated with the strata. All subgroups are represented, allowing separate conclusions about each of them.

Disadvantage: Sampling error is difficult to measure. Loss of precision if a small number is sampled in individual strata.

Example: Estimating vaccination coverage in a country.
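A minimal Python sketch of drawing a sample within each stratum is given below. It assumes proportional allocation (the same sampling fraction in every stratum), which is only one possible allocation rule and is not prescribed by the text; the urban/rural units and the function name are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:                      # classify the population into strata
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population: 60 urban and 40 rural units, 10% sampled per stratum.
units = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
sample = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.10, seed=5)
print({name: len(drawn) for name, drawn in sample.items()})
```

Results are kept per stratum so that separate conclusions can be drawn for each subgroup before they are combined.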

4- Cluster Sampling: This form of sampling has the attraction of being a probability sample without having the need for a sampling frame. There is also the attraction of lowering the field costs by reducing the amount of travelling necessary. These features come about because cluster sampling is based upon the idea of sampling complete subunits. A random sample of groups of units is drawn, and all or a proportion of the units in the selected clusters are included.

Advantage: Simple; no list required; less travel and fewer resources required

Disadvantage: Imprecise if the units within clusters are homogeneous; sampling error difficult to measure

The sampling unit is not a subject but a group of subjects. It is assumed that variability among clusters is minimal and that the variability within each cluster is what is observed in the general population.

Two-stage cluster sample:

1- First stage: select clusters with probability proportional to size

i. Select the number of clusters to include

ii. Compute a cumulative list of the population

iii. Divide the grand total by the number of clusters to obtain the sampling interval

iv. Choose a random number (no larger than the sampling interval) and identify the first cluster

v. Add the sampling interval and identify the second cluster

vi. Repeat until all clusters are identified

2- Second stage: in each cluster, select a random sample using a sampling frame of subjects
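The first-stage selection with probability proportional to size can be sketched in Python as below. This is an illustration under stated assumptions: the random start is drawn as a continuous value within the first sampling interval, and the village names and population sizes are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_select_clusters(cluster_sizes, n_clusters, seed=None):
    """Select clusters with probability proportional to size (systematic PPS)."""
    names = list(cluster_sizes)
    # Cumulative list of the population across clusters.
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    # Sampling interval = grand total / number of clusters.
    interval = cumulative[-1] / n_clusters
    # Random start within the first interval.
    start = random.Random(seed).random() * interval
    selected = []
    for i in range(n_clusters):
        point = start + i * interval          # add the interval each time
        index = min(bisect_right(cumulative, point), len(names) - 1)
        selected.append(names[index])
    return selected

# Hypothetical clusters (villages) with their population sizes.
villages = {"A": 100, "B": 150, "C": 250, "D": 500}
chosen = pps_select_clusters(villages, 2, seed=7)
print(chosen)
```

A cluster larger than the sampling interval can be selected more than once with this method; in practice such clusters are often handled separately.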

5- Multistage Sampling: In much commercial sample survey work it is necessary to carry out a survey using two or even three stages of sampling. The need arises from economic considerations when the geographical area to be covered is very extensive and travel costs need to be minimised. Although multi-stage sampling is not likely in the work of a first-time researcher, we shall outline the process for two-stage sampling because some readers may find it helpful. Conceptually, the population is regarded as comprising a number of primary sampling units, each of which comprises secondary sampling units. It involves several chained samples and several statistical units.

Advantage: No complete listing of the population required

Most feasible approach for large populations

Disadvantage: Several sampling lists

Sampling error difficult to measure

Sampling Error: No sample is a perfect mirror image of the population. Measurement should be precise and unambiguous in an ideal research study.

1. Magnitude of error can be measured in probability samples

2. Expressed by standard error of mean, proportion, differences

3. Function of sample size and variability in measurement.
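To make the dependence on sample size and variability concrete, the standard error of a mean and of a proportion can be computed as in the following Python sketch. It uses the usual textbook formulas (sample standard deviation divided by the square root of n, and the square root of p(1-p)/n) for a simple random sample; the data values are hypothetical.

```python
import math
import statistics

def standard_error_mean(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def standard_error_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical measurements from a sample of 8 subjects.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
se_mean = standard_error_mean(measurements)

# Hypothetical observed proportion of 0.5 in samples of 100 and 400.
se_p_100 = standard_error_proportion(0.5, 100)
se_p_400 = standard_error_proportion(0.5, 400)
print(round(se_mean, 3), round(se_p_100, 3), round(se_p_400, 3))
```

Quadrupling the sample size halves the standard error, which shows why precision improves only slowly as the sample grows.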

The following are the possible sources of error in measurement:

(a) Respondent: At times the respondent may be reluctant to express strong negative feelings, or it is just possible that he may have very little knowledge but may not admit his ignorance.

(b) Situation: Situational factors may also come in the way of correct measurement.

(c) Measurer: The interviewer can distort responses by rewording or reordering questions.

(d) Instrument: Error may arise because of a defective measuring instrument.

When we make wrong calculations, follow a wrong method, draw wrong conclusions, etc., these are known as mistakes.

(i) Errors of Origin; for example, errors arise on account of inappropriate definitions of statistical units, defective questionnaire etc.

(ii) Errors of inadequacy; for example, incomplete data, inadequacy of number of items in the sample, etc.

(iii) Errors of interpretation; for example, errors committed by statisticians.

(iv) Errors of manipulation; for example, clerical errors, arithmetical slips, etc.

Sampling and Non-sampling Errors: The error arising due to drawing inferences about the population on the basis of a few observations (sampling) is termed sampling error. Clearly, the sampling error in this sense is non-existent in a complete enumeration survey, since the whole population is surveyed. However, the errors mainly arising at the stage of ascertainment and processing of data, which are termed non-sampling errors, are common to both complete enumeration and sample surveys.

Biased and Unbiased Errors: The errors that arise due to a bias or prejudice on the part of the informant, enumerator or investigator in selecting, estimating or measuring instruments are called biased errors. Errors which arise in the normal course of investigation or enumeration on account of chance are called unbiased errors.

Sample Size

This refers to the number of items to be selected from the universe to constitute a sample. The size of the sample should neither be excessively large nor too small. It should be optimum. An optimum sample is one which fulfils the requirements of efficiency, representativeness, reliability and flexibility. The following factors should be considered while deciding the sample size:

1. The size of the universe: The larger the size of the universe, the bigger should be the sample size.

2. The resources available: If the resources available are vast a larger sample size could be taken. However, in most cases resources constitute a big constraint on sample size.

3. The degree of accuracy or precision desired: The greater the degree of accuracy desired, the larger should be the sample size. However, it does not necessarily mean that bigger samples always ensure greater accuracy. If a sample is selected by experts following scientific methods, it may give better results even when it is small than a larger sample selected by inexperienced people.

 4. Homogeneity or heterogeneity of the universe: If the universe consists of homogeneous units a small sample may serve the purpose, but if the universe consists of heterogeneous units a large sample may be inevitable.

5. Nature of Study: For an intensive and continuous study a small sample may be suitable. But for studies which are not likely to be repeated and are quite extensive in nature, it may be necessary to take a larger sample size.

6. Method of sampling adopted: The size of sample is also influenced by the type of sampling plan adopted. For example, if the sample is a simple random sample, it may necessitate a bigger sample size. However, in a properly drawn stratified sampling plan, even a small sample may give better results.

7. Nature of respondents: Where it is expected that a large number of respondents will not cooperate and send back the questionnaires, a larger sample should be selected.

Determination of Sample Size

After deciding the degree of precision and confidence level, the next step is to determine the sample size. The formulas to determine sample size are based on results of the sample responses. The important formulas are:

I. If the results are reported as proportions of the sample responses, then the following formula is used:

 

where A = Accuracy desired

Z = Confidence level

N = Population size

σ = Standard deviation of the attribute of interest.

If the researcher wishes to report result in a variety of ways, the following formula may be more useful

 

Where, n = Sample size

Z = Level of confidence

 N = Population size

d = Accuracy (precision) level, such as 0.01, 0.05, etc.

Sources of Sampling and Non-Sampling Errors

1. Sampling Errors: This error is attributed to fluctuations of sampling. Sampling error is due to the fact that only a subset of the population has been used to estimate the population parameters and draw inferences in a sample survey, and it is completely absent in the census method. The following are the sources of sampling errors:

(1) Faulty selection of the sample: Some of the bias is introduced by the use of a defective sampling technique for the selection of a sample, e.g. purposive or judgement sampling, in which the investigator deliberately selects a representative sample to obtain certain results.

(2) Substitution: Substitution of an item in place of one chosen in a random sample sometimes leads to some bias, because the characteristics possessed by the substituted unit will usually be different from those possessed by the original unit.

(3) Error due to bias in the estimation method: Improper choice of the estimation technique might introduce error.

(4) Non-response: If all the items to be included in the sample are not covered, there will be bias even though no substitution has been attempted.

(5) Variability of the population: Sampling error also depends on the variability or heterogeneity of the population to be sampled.

2. Non-sampling Errors: They are due to certain causes which can be traced and may arise at any stage of the enquiry, viz. planning and execution of the survey and collection, processing and analysis of the data. Non-sampling errors are thus present both in census and sampling surveys. Some of the important factors responsible for non-sampling errors in any survey are:

1. Faulty planning including vague and faulty definitions of the population or the statistical units to be used, incomplete list of population members.

2. Vague and imperfect questionnaire which might result in incomplete or wrong information.

3. Defective methods of interviewing and asking questions.

4. Vagueness about the type of data to be collected. Exaggerated or wrong answers to questions which appeal to the pride, prestige or self-interest of the respondents.

5. Personal bias of the investigator.

6. Lack of trained and qualified investigators and lack of supervisory staff.

7. Failure of the respondent’s memory to recall events or happenings in the past.

Essentials of a Good Sample: If the sample results are to have any worthwhile meaning, it is necessary that a sample possesses the following essentials:

(i) Representativeness: A sample should be so selected that it truly represents the universe otherwise the results obtained may be misleading. To ensure representativeness the random method of selection should be used.

(ii) Adequacy: The size of sample should be adequate otherwise it may not represent the characteristics of the universe.

(iii) Independence: All items of the sample should be selected independently of one another and all items of the universe should have the same chance of being selected in the sample. By independence of selection, we mean that the selection of a particular item in one draw has no influence on the probabilities of selection in any other draw.

(iv) Homogeneity: When we talk about homogeneity, we mean that there is no basic difference in the nature of the units of the universe and those of the sample. If two samples from the same universe are taken, they should give more or less the same result.

Calculating Sample Size and Power 

Steps in estimating sample size:

1. Identify major study variable

2. Determine type of estimate

3. Indicate expected frequency of factor of interest

4. Decide on desired precision of estimate

5. Decide on the acceptable risk that the estimate will fall outside its real population value

6. Adjust for population size

7. Adjust for estimated design effect

8. Adjust for expected response rate

α: The significance level of a test; the probability of rejecting the null hypothesis when it is true, i.e. the probability of making a type I error.

Confidence level: The probability that an estimate of a population parameter is within a certain specified limit of the true value. Commonly denoted as (1-α).

β: The probability of failing to reject the null hypothesis when it is false, i.e. the probability of making a type II error.

Power: The probability of correctly rejecting the null hypothesis when it is false. Commonly denoted as (1-β).

Precision: A measure of how close an estimate is to the true value of a population parameter. It may be expressed in absolute terms or relative to the estimate.

Sample size required for estimating population means: -

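d = reliability coefficient × standard error

This relationship is used to calculate the sample size. Taking the standard error from the sampling distribution of the mean:

$$d = z \frac{\sigma}{\sqrt{n}}$$

When solved for n, this gives:

$$n = \frac{z^2 \sigma^2}{d^2}$$

σ = standard deviation

z = reliability coefficient; z = 1.96 for a 95% confidence level (normal distribution)

Confidence level = 95% (α = 0.05)

d = half-width of the desired interval (the desired length on both sides of the estimate)

n = sample size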

Population standard deviation: σ

1. Its square, σ², is the population variance

2. When it is not given, a pilot survey can be conducted to estimate it

3. It can be taken from previous studies

4. In a normal distribution the range R is approximately equal to 6 standard deviations, so

σ ≈ R/6
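As a worked illustration, the following Python sketch computes the sample size for estimating a mean from the standard relationship d = z × σ/√n, i.e. n = (zσ/d)², and uses the range rule above to approximate σ. The range of 120 units and the precision d = 5 are hypothetical values, the function names are invented for this example, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def z_for_confidence(confidence=0.95):
    """Two-sided reliability coefficient, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sigma_from_range(value_range):
    """Rough estimate of sigma from the range (range is about 6 sigma)."""
    return value_range / 6

def sample_size_for_mean(sigma, d, confidence=0.95):
    """n = (z * sigma / d)^2, rounded up."""
    z = z_for_confidence(confidence)
    return math.ceil((z * sigma / d) ** 2)

# Hypothetical example: values range over 120 units, desired precision +/- 5.
sigma = sigma_from_range(120)          # 20.0
n = sample_size_for_mean(sigma, d=5)
print(sigma, n)
```

Halving d roughly quadruples the required sample size, because d enters the formula squared.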

Sample size required for estimating proportion

p = proportion in the population

A pilot sample can be used to estimate p, or an estimate can be taken from a previous study. If it is impossible to come up with a better estimate, set p = 0.5 in the formula to yield the maximum value of n.

$$n = \frac{z^2 p q}{d^2}$$

q = (1-p)

d = absolute precision
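The effect of the choice of p can be checked with a short Python sketch based on the standard formula n = z²pq/d² for a simple random sample. The precision of 0.05, the alternative value p = 0.2, and the function name are hypothetical choices for illustration; results are rounded up.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, d, confidence=0.95):
    """n = z^2 * p * q / d^2 with q = 1 - p, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    return math.ceil(z ** 2 * p * q / d ** 2)

# Hypothetical example: absolute precision of 5 percentage points, 95% confidence.
n_unknown_p = sample_size_for_proportion(p=0.5, d=0.05)   # no prior estimate of p
n_prior_p = sample_size_for_proportion(p=0.2, d=0.05)     # p taken from a pilot sample
print(n_unknown_p, n_prior_p)
```

Since p × q is largest at p = 0.5, using 0.5 gives the most conservative (largest) sample size.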

Design effect: A bias in the variance introduced in the sampling design by selecting subjects whose results are not independent from each other; the relative change in variance due to the use of clusters.

1. The design effect can be calculated after study completion, but should be accounted for at the design stage

2. The design effect is 1 (i.e. no design effect) when taking a simple random sample; it varies in cluster sampling, where a value is usually estimated in advance for cluster surveys

3. Multiply the sample size by the design effect
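The adjustments for population size, design effect, and expected response rate can be combined as in the Python sketch below. The finite population correction n / (1 + (n - 1)/N) is a commonly used formula that is assumed here rather than taken from this chapter, and the starting sample size of 385, the design effect of 2, and the response rate of 80% are hypothetical values.

```python
import math

def adjust_sample_size(n0, population=None, design_effect=1.0, response_rate=1.0):
    """Adjust a simple-random-sample size n0 for the study design.

    population    -- finite population size N (assumed correction formula)
    design_effect -- multiplier; 1.0 for a simple random sample
    response_rate -- expected proportion of selected subjects who respond
    """
    n = float(n0)
    if population is not None:
        n = n / (1 + (n - 1) / population)   # finite population correction
    n *= design_effect                       # multiply by the design effect
    n /= response_rate                       # inflate for expected non-response
    return math.ceil(n)

# Hypothetical cluster survey: design effect 2, 80% expected response.
n_cluster = adjust_sample_size(385, design_effect=2, response_rate=0.8)
# Hypothetical small population of 10,000 with a simple random sample.
n_small_pop = adjust_sample_size(385, population=10_000)
print(n_cluster, n_small_pop)
```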

Sample size for analytic studies

1. Desired values for the probabilities of α and β

2. Cohort

3. Proportion of exposed/risk (p1)

4. Proportion of non-exposed (p0)

 

10% rule: -

1. Note that sample size estimate should be interpreted as providing merely a minimum estimate of the sample size necessary for the study.

2. Formula takes into account only the overall association between exposure & disease

3. 10% rule: increase the sample size by 10% for each confounding variable added
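A minimal Python sketch of the 10% rule follows. It assumes the increase is additive (10% of the minimum sample size per confounder) rather than compounded, which the rule as stated does not specify; the minimum sample size of 200 and the function name are hypothetical.

```python
def apply_ten_percent_rule(n_min, n_confounders):
    """Increase the minimum sample size by 10% per confounder (additive), rounded up."""
    # Integer arithmetic avoids floating-point rounding: n * (10 + k) / 10, rounded up.
    return (n_min * (10 + n_confounders) + 9) // 10

# Hypothetical minimum sample size of 200 with three confounding variables.
n_adjusted = apply_ten_percent_rule(200, 3)
print(n_adjusted)
```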

References:

1.     Moser A, Korstjens I. Series: Practical guidance to qualitative research. Part 3: Sampling, data collection and analysis. Eur J Gen Pract. 2018 Dec;24(1):9-18. doi: 10.1080/13814788.2017.1375091. Epub 2017 Dec 4. PMID: 29199486; PMCID: PMC5774281.

2.     Carr LT. The strengths and weaknesses of quantitative and qualitative research: what method for nursing? J Adv Nurs. 1994 Oct;20(4):716-21. doi: 10.1046/j.1365-2648.1994.20040716.x. PMID: 7822608.

3.     Korstjens I, Moser A. Series: Practical guidance to qualitative research. Part 2: Context, research questions and designs. Eur J Gen Pract. 2017 Dec;23(1):274-279. doi: 10.1080/13814788.2017.1375090. PMID: 29185826; PMCID: PMC8816399.

4.     Gelling L. Stages in the research process. Nurs Stand. 2015 Mar 4;29(27):44-9. doi: 10.7748/ns.29.27.44.e8745. PMID: 25736674.

5.     Vasileiou K, Barnett J, Thorpe S, Young T. Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period. BMC Med Res Methodol. 2018 Nov 21;18(1):148. doi: 10.1186/s12874-018-0594-7. PMID: 30463515; PMCID: PMC6249736.

6.     Tuckett AG. Qualitative research sampling: the very real complexities. Nurse Res. 2004;12(1):47-61. doi: 10.7748/nr2004.07.12.1.47.c5930. PMID: 15493214.