Researchers are often interested in using a pooled shRNA library for genome-wide RNAi screening to cast a very “wide and unbiased net” to identify any and all genes functionally involved in some pathway. Although it is not difficult to make an shRNA library targeting all human or mouse genes, it is difficult to comprehensively screen using such a library. Careful consideration of starting cell numbers and handling of cells during propagation is essential to ensure thorough screening of pooled shRNA expression libraries, minimize false negatives, and obtain consistent and reproducible results.

Library Complexity and Number of Cells

First, there is the issue of library complexity. The effectiveness of validated shRNA varies from cell to cell, so it is necessary to incorporate several shRNAs for each gene to ensure reasonable knockdown of a high percentage of targets. Cellecta typically designs 5-10 shRNAs against each target gene depending on the design of the library, so at least 25,000 shRNAs are required to target 2,500-5,000 genes. A library targeting the entire human genome of approximately 20,000 genes requires approximately 150,000 individual shRNA constructs. While it is not particularly difficult to construct libraries of this complexity, this number of unique shRNA sequences can create technical challenges with some types of screens.
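As a rough illustration of the arithmetic above, the following sketch multiplies a gene count by the 5-10 shRNAs per gene mentioned in the text. The function name and its parameters are hypothetical and exist only for this example.

```python
def library_size_range(n_genes, shrnas_per_gene=(5, 10)):
    """Return the (low, high) number of shRNA constructs for n_genes.

    shrnas_per_gene is the 5-10 designs per target gene quoted in the text;
    the actual number depends on the design of the library.
    """
    low, high = shrnas_per_gene
    return n_genes * low, n_genes * high


# 2,500 genes at 10 shRNAs each, or 5,000 genes at 5 each, both give 25,000.
print(library_size_range(2_500)[1], library_size_range(5_000)[0])

# A whole-genome library (about 20,000 genes) falls in this range.
print(library_size_range(20_000))
```

The whole-genome figure quoted in the text sits inside the range this produces for 20,000 genes.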

Number of Starting Cells and Representation

Pooled shRNA library screens require quantification of changes in the fraction of cells bearing each shRNA sequence in selected vs. control cells (or starting library). A “hit” occurs when selected cells have significantly more or fewer cells bearing a particular shRNA sequence. Whether one is looking at enrichment of shRNA sequences in the selected cell population vs. control (positive selection) or depletion of shRNA sequences in the selected cell population vs. control (negative selection), it is critical that the screen begins with sufficient numbers of cells expressing each shRNA to ensure that measured changes in the fraction of cells bearing any given shRNA sequence are statistically significant. If very few cells bear a specific shRNA sequence at the start of the screen, small random changes in a drifting population may be difficult to differentiate from significant trends.

Simply put, a loss of 2 cells is a 20% change if there are only 10 cells initially vs. 2% if there are 100. For this reason, at least a few hundred cells need to be infected with each shRNA to initiate a good screen. This is demonstrated in the Reproducibility of Triplicates figure below, where starting with a smaller population of just 50 cells per shRNA (third bar) leads to significantly more variation than starting with a population of 200 cells per shRNA (first bar).
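The effect of starting cell number on noise can be sketched numerically. The first function reproduces the 2-cell example from the text. The second is an assumption that is not from the original: it treats random fluctuation in the number of cells per shRNA as Poisson-like, so the relative noise scales as 1/sqrt(n). Real screens have additional sources of variation, so treat the output as a lower bound for intuition only.

```python
import math


def relative_change(cells_lost, starting_cells):
    """Fractional change caused by losing cells_lost out of starting_cells."""
    return cells_lost / starting_cells


def poisson_cv(cells_per_shrna):
    """Coefficient of variation under an ASSUMED Poisson sampling model."""
    return 1.0 / math.sqrt(cells_per_shrna)


# The example from the text: losing 2 cells out of 10 vs. out of 100.
print(f"{relative_change(2, 10):.0%} vs. {relative_change(2, 100):.0%}")

# Illustrative comparison of 50 vs. 200 cells per shRNA (assumed model).
for n in (50, 200):
    print(f"{n} cells per shRNA: relative noise of roughly {poisson_cv(n):.1%}")
```

Under this simplified model, quadrupling the cells per shRNA from 50 to 200 halves the relative sampling noise.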

To ensure adequate representation of the whole library in the initial population, start a screen by infecting at least 200 times more cells than the complexity of the library. For a library with 25,000 shRNAs, the starting population should be at least 5 million infected cells, and for a library with 55,000 shRNAs, the starting population should be at least 11 million infected cells.
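The 200-fold rule can be written as a one-line calculation. This is a minimal sketch; the function and constant names are hypothetical, and only the factor of 200 comes from the text.

```python
FOLD_COVERAGE_AT_START = 200  # infected cells per shRNA, as recommended above


def min_infected_cells(library_complexity, fold=FOLD_COVERAGE_AT_START):
    """Minimum number of infected cells needed to start the screen."""
    return library_complexity * fold


def coverage_per_shrna(infected_cells, library_complexity):
    """Average number of infected cells per shRNA for a given population."""
    return infected_cells / library_complexity


for complexity in (25_000, 55_000):
    print(f"{complexity:,} shRNAs: {min_infected_cells(complexity):,} infected cells")
```

The helper `coverage_per_shrna` runs the calculation in reverse, which can be useful for checking whether a transduction that yielded fewer cells than planned still meets the 200-fold target.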

Multiplicity of Infection

For pooled shRNA screens, it is important to have 2-3 times more cells than infecting viral particles (i.e., a multiplicity of infection [MOI] of 0.3-0.5) to ensure that the majority of cells are infected with only one shRNA-carrying virus. As a result, you need to start with 2-3 times more cells than the number targeted for infection. Thus, 10-15 million cells are needed to start a screen with a library of 25,000 shRNAs, and a whole-genome library of 150,000 shRNAs would require 60-90 million cells. Since each screen should be done in duplicate or, better, triplicate, the number of cells needed makes a full-genome screen with a redundant shRNA library challenging.
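Combining the 200-fold representation target with the 2-3 times excess of cells gives the total number of cells to start with. The sketch below uses hypothetical names; the `replicates` parameter is an added convenience that simply multiplies the per-screen figure.

```python
FOLD_COVERAGE_AT_START = 200   # infected cells per shRNA, from the text
CELL_EXCESS_RANGE = (2, 3)     # 2-3 times more cells than the infected target


def cells_to_start(library_complexity, replicates=1):
    """Return the (low, high) number of cells needed before transduction."""
    infected_target = library_complexity * FOLD_COVERAGE_AT_START
    low, high = CELL_EXCESS_RANGE
    return infected_target * low * replicates, infected_target * high * replicates


# Whole-genome library of 150,000 shRNAs, single screen.
low, high = cells_to_start(150_000)
print(f"{low:,} to {high:,} cells")

# The same library screened in triplicate.
low, high = cells_to_start(150_000, replicates=3)
print(f"{low:,} to {high:,} cells")
```

For the 150,000-shRNA library this reproduces the 60-90 million cells quoted above, and tripling it for triplicates shows why a single whole-genome pool is hard to handle.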

The lower the MOI, the more cells you need to start the screen, so it is tempting to use a high MOI. However, a higher MOI produces a higher percentage of infected cells bearing two or more different shRNA constructs. For most RNAi screens, we recommend optimizing conditions and performing genetic screen transductions at an MOI of no more than 0.5 (ca. 40% transduction efficiency), which balances these two considerations. Please note that to accurately calculate the MOI, it is critical to determine the library titer directly in your target cells prior to beginning your experiment. Once conditions are established to achieve ~40% transduction efficiency in the titering assay, scale up all conditions proportionately to accommodate the larger number of transduced cells needed for the genetic screen.
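The trade-off between MOI and multiple integrations can be illustrated with a Poisson model. This model is an assumption added here, not something stated in the original: it supposes that viral particles infect cells independently and that every cell is equally susceptible, which real cell populations only approximate. The function names are hypothetical.

```python
import math


def fraction_transduced(moi):
    """Fraction of cells with at least one integrant (assumed Poisson model)."""
    return 1.0 - math.exp(-moi)


def fraction_multiply_infected(moi):
    """Among transduced cells, the fraction carrying two or more constructs."""
    p_zero = math.exp(-moi)        # cells with no integrant
    p_one = moi * math.exp(-moi)   # cells with exactly one integrant
    return (1.0 - p_zero - p_one) / (1.0 - p_zero)


for moi in (0.3, 0.5, 1.0):
    print(
        f"MOI {moi}: {fraction_transduced(moi):.0%} of cells transduced, "
        f"{fraction_multiply_infected(moi):.0%} of those carry 2+ constructs"
    )
```

Under this model an MOI of 0.5 corresponds to about 39% transduced cells, in line with the ca. 40% figure above, and the share of transduced cells carrying more than one construct rises steadily as the MOI increases.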

Representation and Cell Propagation Techniques

Finally, to ensure a comprehensive screen, it is not sufficient simply to start with the right number of cells. During the screening process, incorrect propagation of the cells can completely undercut the representation set up at the initiation of the screen. This is especially true for a negative selection screen, such as a viability screen where one is interested in identifying shRNAs that kill or inhibit proliferation of cells and, therefore, drop out of the population. It is critical to maintain the full library representation that was established at the start of the screen.

If a portion of the cells is removed during propagation (e.g., cells are split), the representation of the library can be skewed in the sample, which introduces significant random noise. This effect is readily seen in the second bar of the Reproducibility of Triplicates figure, where the effect of starting with sufficient cells (i.e., 200-fold library complexity) is completely undercut by splitting cells during propagation so that the final count of cells after 10 days is the same as the initial number of transduced cells (i.e., 200-fold library complexity). The correlation between triplicates falls dramatically when the cells are split to this degree. For this reason, if cells are ever to be discarded or samples split at any time during the screen, the number of remaining cells in each sample should always exceed the complexity of the library by at least 1,000-fold, as shown in the first bar of the figure. For example, keep at least 27 million cells after every splitting step for a 27K library. Also, before splitting or discarding, make sure you first pool all cells from the same replicate together.
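The splitting rule can be checked before each passage with a small helper. This is a minimal sketch with hypothetical names; only the 1,000-fold factor and the advice to pool cells from the same replicate first come from the text.

```python
FOLD_COVERAGE_AFTER_SPLIT = 1_000  # cells to keep per shRNA, from the text


def min_cells_to_keep(library_complexity, fold=FOLD_COVERAGE_AFTER_SPLIT):
    """Minimum number of cells to retain after any split or discard step."""
    return library_complexity * fold


def max_cells_to_discard(flask_counts, library_complexity):
    """Pool all flasks of one replicate, then return how many cells may go.

    flask_counts is a list of cell counts, one per flask of the SAME replicate.
    Returns 0 if the pooled population is already at or below the minimum.
    """
    pooled = sum(flask_counts)
    return max(0, pooled - min_cells_to_keep(library_complexity))


# A 27K library: keep at least 27 million cells after every split.
print(f"{min_cells_to_keep(27_000):,}")

# Three flasks of 15 million cells each from one replicate.
print(f"{max_cells_to_discard([15_000_000] * 3, 27_000):,}")
```

In the second example the pooled replicate holds 45 million cells, so up to 18 million could be discarded while still keeping 1,000-fold representation.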

Modular Approach to Genome-Wide Screening

Library representation is often overlooked, especially when the desire is for large-scale unbiased screens. However, unless screening procedures are designed to reflect the complexity of the library, these large-scale screens can produce relatively meaningless data with anecdotal results at best. So, what about genome-wide screening? Our approach is to provide modules, each targeting approximately 5,000 genes with 27,500 shRNA in our Three- or Two-Module DECIPHER Human and Mouse libraries, or targeting 6,500 genes with 55,000 shRNA in our Three-Module Human Genome-Wide (hGW) library. These modules enable comprehensive genome-wide screens with manageable numbers of cells for negative selection screens. For cases where it is practical to work with larger numbers of cells, for example some positive screens, the modules of the hGW library can be combined to make a larger library since they contain non-overlapping barcodes (the exception being Modules 1 and 3 of the DECIPHER Human shRNA Library).

Last modified: 10 January 2019
