<html lang="en">

  <head>
    <title>Information content differentiates enhancers from silencers in mouse photoreceptors
    </title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link href="https://unpkg.com/@stencila/thema@2/dist/themes/elife/styles.css" rel="stylesheet">
    <script src="https://unpkg.com/@stencila/thema@2/dist/themes/elife/index.js"
      type="text/javascript"></script>
    <script
      src="https://unpkg.com/@stencila/components@&lt;=1/dist/stencila-components/stencila-components.esm.js"
      type="module"></script>
    <script
      src="https://unpkg.com/@stencila/components@&lt;=1/dist/stencila-components/stencila-components.js"
      type="text/javascript" nomodule=""></script>
  </head>

  <body>
    <main role="main">
      <article itemscope="" itemtype="http://schema.org/Article" data-itemscope="root">
        <h1 itemprop="headline">Information content differentiates enhancers from silencers in mouse
          photoreceptors</h1>
        <meta itemprop="image"
          content="https://via.placeholder.com/1200x714/dbdbdb/4a4a4a.png?text=Information%20content%20differentiates%20enhancers%20from%20silencers%20in%20mouse%20photoreceptors">
        <ol data-itemprop="authors">
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="Ryan Z Friedman"><span data-itemprop="givenNames"><span
                itemprop="givenName">Ryan</span><span itemprop="givenName">Z</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">Friedman</span></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-1">1</a><a itemprop="affiliation"
                href="#author-organization-2">2</a></span>
          </li>
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="David M Granas"><span data-itemprop="givenNames"><span
                itemprop="givenName">David</span><span itemprop="givenName">M</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">Granas</span></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-1">1</a><a itemprop="affiliation"
                href="#author-organization-2">2</a></span>
          </li>
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="Connie A Myers"><span data-itemprop="givenNames"><span
                itemprop="givenName">Connie</span><span itemprop="givenName">A</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">Myers</span></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-3">3</a></span>
          </li>
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="Joseph C Corbo"><span data-itemprop="givenNames"><span
                itemprop="givenName">Joseph</span><span itemprop="givenName">C</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">Corbo</span></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-3">3</a></span>
          </li>
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="Barak A Cohen"><span data-itemprop="givenNames"><span
                itemprop="givenName">Barak</span><span itemprop="givenName">A</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">Cohen</span></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-1">1</a><a itemprop="affiliation"
                href="#author-organization-2">2</a></span>
          </li>
          <li itemscope="" itemtype="http://schema.org/Person" itemprop="author">
            <meta itemprop="name" content="Michael A White"><span data-itemprop="givenNames"><span
                itemprop="givenName">Michael</span><span itemprop="givenName">A</span></span><span
              data-itemprop="familyNames"><span itemprop="familyName">White</span></span><span
              data-itemprop="emails"><a itemprop="email"
                href="mailto:mawhite@wustl.edu">mawhite@wustl.edu</a></span><span
              data-itemprop="affiliations"><a itemprop="affiliation"
                href="#author-organization-1">1</a><a itemprop="affiliation"
                href="#author-organization-2">2</a></span>
          </li>
        </ol>
        <ol data-itemprop="affiliations">
          <li itemscope="" itemtype="http://schema.org/Organization" itemid="#author-organization-1"
            id="author-organization-1"><span itemprop="name">Edison Family Center for Genome
              Sciences and Systems Biology, Washington University School of Medicine</span><address
              itemscope="" itemtype="http://schema.org/PostalAddress" itemprop="address"><span
                itemprop="addressLocality">St. Louis</span><span itemprop="addressCountry">United
                States</span></address></li>
          <li itemscope="" itemtype="http://schema.org/Organization" itemid="#author-organization-2"
            id="author-organization-2"><span itemprop="name">Department of Genetics, Washington
              University School of Medicine</span><address itemscope=""
              itemtype="http://schema.org/PostalAddress" itemprop="address"><span
                itemprop="addressLocality">St. Louis</span><span itemprop="addressCountry">United
                States</span></address></li>
          <li itemscope="" itemtype="http://schema.org/Organization" itemid="#author-organization-3"
            id="author-organization-3"><span itemprop="name">Department of Pathology and Immunology,
              Washington University School of Medicine</span><address itemscope=""
              itemtype="http://schema.org/PostalAddress" itemprop="address"><span
                itemprop="addressLocality">St. Louis</span><span itemprop="addressCountry">United
                States</span></address></li>
        </ol><span itemscope="" itemtype="http://schema.org/Organization" itemprop="publisher">
          <meta itemprop="name" content="Unknown"><span itemscope=""
            itemtype="http://schema.org/ImageObject" itemprop="logo">
            <meta itemprop="url"
              content="https://via.placeholder.com/600x60/dbdbdb/4a4a4a.png?text=Unknown">
          </span>
        </span><time itemprop="datePublished" datetime="2021-09-06">2021-09-06</time>
        <ul data-itemprop="genre">
          <li itemprop="genre">Research Article</li>
        </ul>
        <ul data-itemprop="about">
          <li itemscope="" itemtype="http://schema.org/DefinedTerm" itemprop="about"><span
              itemprop="name">Computational and Systems Biology</span></li>
          <li itemscope="" itemtype="http://schema.org/DefinedTerm" itemprop="about"><span
              itemprop="name">Genetics and Genomics</span></li>
        </ul>
        <ul data-itemprop="keywords">
          <li itemprop="keywords">enhancers</li>
          <li itemprop="keywords">silencers</li>
          <li itemprop="keywords">information theory</li>
          <li itemprop="keywords">massively parallel reporter assays</li>
          <li itemprop="keywords">Mouse</li>
        </ul>
        <ul data-itemprop="identifiers">
          <li itemscope="" itemtype="http://schema.org/PropertyValue" itemprop="identifier">
            <meta itemprop="propertyID"
              content="https://registry.identifiers.org/registry/publisher-id"><span
              itemprop="name">publisher-id</span><span itemprop="value"
              data-itemtype="http://schema.org/Number">67403</span>
          </li>
          <li itemscope="" itemtype="http://schema.org/PropertyValue" itemprop="identifier">
            <meta itemprop="propertyID" content="https://registry.identifiers.org/registry/doi">
            <span itemprop="name">doi</span><span itemprop="value">10.7554/eLife.67403</span>
          </li>
          <li itemscope="" itemtype="http://schema.org/PropertyValue" itemprop="identifier">
            <meta itemprop="propertyID"
              content="https://registry.identifiers.org/registry/elocation-id"><span
              itemprop="name">elocation-id</span><span itemprop="value">e67403</span>
          </li>
        </ul>
        <section data-itemprop="description">
          <h2 data-itemtype="http://schema.stenci.la/Heading">Abstract</h2>
          <meta itemprop="description"
            content="Enhancers and silencers often depend on the same transcription factors (TFs) and are conflated in genomic assays of TF binding or chromatin state. To identify sequence features that distinguish enhancers and silencers, we assayed massively parallel reporter libraries of genomic sequences targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both enhancers and silencers contain more TF motifs than inactive sequences, but relative to silencers, enhancers contain motifs from a more diverse collection of TFs. We developed a measure of information content that describes the number and diversity of motifs in a sequence and found that, while both enhancers and silencers depend on CRX motifs, enhancers have higher information content. The ability of information content to distinguish enhancers and silencers targeted by the same TF illustrates how motif context determines the activity of cis-regulatory sequences.">
          <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Enhancers and silencers often
            depend on the same transcription factors (TFs) and are conflated in genomic assays of TF
            binding or chromatin state. To identify sequence features that distinguish enhancers and
            silencers, we assayed massively parallel reporter libraries of genomic sequences
            targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both
            enhancers and silencers contain more TF motifs than inactive sequences, but relative to
            silencers, enhancers contain motifs from a more diverse collection of TFs. We developed
            a measure of information content that describes the number and diversity of motifs in a
            sequence and found that, while both enhancers and silencers depend on CRX motifs,
            enhancers have higher information content. The ability of information content to
            distinguish enhancers and silencers targeted by the same TF illustrates how motif
            context determines the activity of <em itemscope=""
              itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences.</p>
        </section>
        <h2 itemscope="" itemtype="http://schema.stenci.la/Heading" id="introduction">Introduction
        </h2>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Active <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences in the genome
          are characterized by accessible chromatin and specific histone modifications, which
          reflect the action of DNA-binding transcription factors (TFs) that recognize specific
          sequence motifs and recruit chromatin-modifying enzymes <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib44"><span>44</span><span>Klemm et
                al.</span><span>2019</span></a></cite>. These epigenetic hallmarks of active
          chromatin are routinely used to train machine learning models that predict <em
            itemscope="" itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences,
          based on the assumption that such epigenetic marks are reliable predictors of genuine <em
            itemscope="" itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences
          <span itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib13"><span>13</span><span>Ernst
                  and Kellis</span><span>2012</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib19"><span>19</span><span>Ghandi
                  et al.</span><span>2014</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib27"><span>27</span><span>Hoffman
                  et al.</span><span>2012</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib41"><span>41</span><span>Kelley
                  et al.</span><span>2016</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib50"><span>50</span><span>Lee et
                  al.</span><span>2011</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib77"><span>77</span><span>Sethi et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib90"><span>90</span><span>Zhou and
                  Troyanskaya</span><span>2015</span></a></cite></span>. However, results from
          functional assays show that many predicted <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences exhibit little
          or no <em itemscope="" itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory
          activity. Typically, 50% or more of predicted <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences fail to drive
          expression in massively parallel reporter assays (MPRAs) <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib58"><span>58</span><span>Moore et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib48"><span>48</span><span>Kwasnieski et
                  al.</span><span>2014</span></a></cite></span>, indicating that an active chromatin
          state is not sufficient to reliably identify <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Another challenge is that
          enhancers and silencers are difficult to distinguish by chromatin accessibility or
          epigenetic state <span itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite
              itemscope="" itemtype="http://schema.stenci.la/Cite"><a
                href="#bib11"><span>11</span><span>Doni Jayavelu et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib20"><span>20</span><span>Gisselbrecht et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib62"><span>62</span><span>Pang and
                  Snyder</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib66"><span>66</span><span>Petrykowska et
                  al.</span><span>2008</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib76"><span>76</span><span>Segert
                  et al.</span><span>2021</span></a></cite></span>, and thus computational
          predictions of <em itemscope="" itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences often do not differentiate between enhancers and
          silencers. Silencers are often enhancers in other cell types <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib5"><span>5</span><span>Brand et
                  al.</span><span>1987</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib11"><span>11</span><span>Doni
                  Jayavelu et al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib20"><span>20</span><span>Gisselbrecht et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib30"><span>30</span><span>Huang et
                  al.</span><span>2021</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib37"><span>37</span><span>Jiang et
                  al.</span><span>1993</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib61"><span>61</span><span>Ngan et
                  al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib62"><span>62</span><span>Pang and
                  Snyder</span><span>2020</span></a></cite></span>, reside in open chromatin <span
            itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib11"><span>11</span><span>Doni
                  Jayavelu et al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib29"><span>29</span><span>Huang et
                  al.</span><span>2019</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib30"><span>30</span><span>Huang et
                  al.</span><span>2021</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib62"><span>62</span><span>Pang and
                  Snyder</span><span>2020</span></a></cite></span>, sometimes bear epigenetic marks
          of active enhancers <span itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite
              itemscope="" itemtype="http://schema.stenci.la/Cite"><a
                href="#bib14"><span>14</span><span>Fan et
                  al.</span><span>2016</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib30"><span>30</span><span>Huang et
                  al.</span><span>2021</span></a></cite></span>, and can be bound by TFs that also
          act on enhancers in the same cell type <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib1"><span>1</span><span>Alexandre
                  and Vincent</span><span>2003</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib21"><span>21</span><span>Grass et
                  al.</span><span>2003</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib30"><span>30</span><span>Huang et
                  al.</span><span>2021</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib35"><span>35</span><span>Iype et
                  al.</span><span>2004</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib37"><span>37</span><span>Jiang et
                  al.</span><span>1993</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib52"><span>52</span><span>Liu et
                  al.</span><span>2014</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib53"><span>53</span><span>Martínez-Montañés et
                  al.</span><span>2013</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib65"><span>65</span><span>Peng et
                  al.</span><span>2005</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib69"><span>69</span><span>Rachmin
                  et al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib70"><span>70</span><span>Rister
                  et al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib80"><span>80</span><span>Stampfel
                  et al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib85"><span>85</span><span>White et
                  al.</span><span>2013</span></a></cite></span>. As a result, enhancers and
          silencers share similar sequence features, and understanding how they are distinguished in
          a particular cell type remains an important challenge <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib76"><span>76</span><span>Segert et
                al.</span><span>2021</span></a></cite>.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The TF cone-rod homeobox (CRX)
          controls selective gene expression in a number of different photoreceptor and bipolar cell
          types in the retina <span itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite
              itemscope="" itemtype="http://schema.stenci.la/Cite"><a
                href="#bib6"><span>6</span><span>Chen et al.</span><span>1997</span></a></cite><cite
              itemscope="" itemtype="http://schema.stenci.la/Cite"><a
                href="#bib17"><span>17</span><span>Freund et
                  al.</span><span>1997</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib18"><span>18</span><span>Furukawa
                  et al.</span><span>1997</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib60"><span>60</span><span>Murphy
                  et al.</span><span>2019</span></a></cite></span>. These cell types derive from the
          same progenitor cell population <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib45"><span>45</span><span>Koike et
                  al.</span><span>2007</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib83"><span>83</span><span>Wang et
                  al.</span><span>2014</span></a></cite></span>, but they exhibit divergent,
          CRX-directed transcriptional programs <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib9"><span>9</span><span>Corbo et
                  al.</span><span>2010</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib25"><span>25</span><span>Hennig
                  et al.</span><span>2008</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib31"><span>31</span><span>Hughes
                  et al.</span><span>2017</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib60"><span>60</span><span>Murphy
                  et al.</span><span>2019</span></a></cite></span>. CRX cooperates with cell
          type-specific co-factors to selectively activate and repress different genes in different
          cell types and is required for differentiation of rod and cone photoreceptors <span
            itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib7"><span>7</span><span>Chen et
                  al.</span><span>2005</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib23"><span>23</span><span>Hao et
                  al.</span><span>2012</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib25"><span>25</span><span>Hennig
                  et al.</span><span>2008</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib28"><span>28</span><span>Hsiau et
                  al.</span><span>2007</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib34"><span>34</span><span>Irie et
                  al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib43"><span>43</span><span>Kimura
                  et al.</span><span>2000</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib51"><span>51</span><span>Lerner
                  et al.</span><span>2005</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib55"><span>55</span><span>Mears et
                  al.</span><span>2001</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib56"><span>56</span><span>Mitton
                  et al.</span><span>2000</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib60"><span>60</span><span>Murphy
                  et al.</span><span>2019</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib65"><span>65</span><span>Peng et
                  al.</span><span>2005</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib75"><span>75</span><span>Sanuki
                  et al.</span><span>2010</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib79"><span>79</span><span>Srinivas
                  et al.</span><span>2006</span></a></cite></span>. However, the sequence features
          that define CRX-targeted enhancers vs. silencers in the retina are largely unknown.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We previously found that a
          significant minority of CRX-bound sequences act as silencers in an MPRA conducted in live
          mouse retinas <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib85"><span>85</span><span>White et al.</span><span>2013</span></a></cite>,
          and that silencer activity requires CRX <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib86"><span>86</span><span>White et
                al.</span><span>2016</span></a></cite>. Here, we extend our analysis by testing
          thousands of additional candidate <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences. We show that
          while regions of accessible chromatin and CRX binding exhibit a range of <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory activity, enhancers and
          silencers contain more TF motifs than inactive sequences, and that enhancers are
          distinguished from silencers by a higher diversity of TF motifs. We capture the
          differences between these sequence classes with a new metric, motif information content
          (Boltzmann entropy), that considers only the number and diversity of TF motifs in a
          candidate <em itemscope="" itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory
          sequence. Our results suggest that CRX-targeted enhancers are defined by a flexible
          regulatory grammar and demonstrate how differences in motif information content encode
          functional differences between genomic loci with similar chromatin states.</p>
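        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">As a rough illustration of how a metric based only on the number and diversity of motifs can behave, the sketch below computes a Boltzmann-style entropy from per-TF motif counts. It assumes that the entropy is the base-2 logarithm of the multinomial number of distinguishable arrangements of the motifs. The function name <code>motif_information_content</code> and the toy counts are hypothetical, and the definition used in this study may differ in detail.</p>

```python
from math import lgamma, log


def motif_information_content(occupancies):
    """Boltzmann-style entropy, in bits, of a list of per-TF motif counts.

    Assumption for illustration: W = N! / prod(n_i!), entropy = log2(W).
    The gamma function is used so that fractional counts are also accepted.
    """
    total = sum(occupancies)
    # ln W = ln Gamma(N + 1) - sum_i ln Gamma(n_i + 1)
    log_w = lgamma(total + 1) - sum(lgamma(n + 1) for n in occupancies)
    # Convert from natural log to bits
    return log_w / log(2)


# Three copies of a single motif: only one arrangement, so zero bits
same_tf = motif_information_content([3, 0, 0])
# One motif each from three different TFs: 3! = 6 arrangements
mixed = motif_information_content([1, 1, 1])
print(same_tf, mixed)
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Under this assumption, three copies of one motif score 0 bits, whereas one motif from each of three TFs scores log2(6), about 2.58 bits. The value therefore rises with motif diversity and, for mixed motif sets, with motif number.</p>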
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="1" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Setup imports for analysis
import os
import sys
import itertools

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.axes_grid1 import make_axes_locatable
from scipy import stats
from sklearn.feature_selection import RFE, RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from pybedtools import BedTool
from IPython.display import display
import logomaker

sys.path.insert(0, &quot;utils&quot;)
from utils import (
    fasta_seq_parse_manip,
    gkmsvm,
    modeling,
    plot_utils,
    predicted_occupancy,
    quality_control,
    sequence_annotation_processing,
)

data_dir = os.path.join(&quot;Data&quot;)
figures_dir = os.path.join(&quot;Figures&quot;)

# Load in all sequences
all_seqs = fasta_seq_parse_manip.read_fasta(os.path.join(data_dir, &quot;library1And2.fasta&quot;))
# Drop scrambled sequences -- we don&#39;t need them for any analysis
all_seqs = all_seqs[~(all_seqs.index.str.contains(&quot;scr&quot;))]</code></pre>
        </stencila-code-chunk>
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="2" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code>plot_utils.set_manuscript_params()</code></pre>
        </stencila-code-chunk>
        <h2 itemscope="" itemtype="http://schema.stenci.la/Heading" id="results">Results</h2>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We tested the activities of
          4844 putative CRX-targeted <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences (CRX-targeted
          sequences) by MPRA in live retinas. The MPRA libraries consist of 164 bp genomic sequences
          centered on the best match to the CRX position weight matrix (PWM) <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib49"><span>49</span><span>Lee et
                al.</span><span>2010</span></a></cite> whenever a CRX motif is present, and matched
          sequences in which all CRX motifs were abolished by point mutation (Materials and
          methods). The MPRA libraries include 3299 CRX-bound sequences identified by ChIP-seq in
          the adult retina <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib9"><span>9</span><span>Corbo et al.</span><span>2010</span></a></cite> and
          1545 sequences that do not have measurable CRX binding in the adult retina but reside in
          accessible chromatin in adult photoreceptors <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib31"><span>31</span><span>Hughes et
                al.</span><span>2017</span></a></cite> and have the H3K27ac enhancer mark in
          postnatal day 14 (P14) retina <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib72"><span>72</span><span>Ruzycki et
                al.</span><span>2018</span></a></cite> (‘ATAC-seq peaks’). We split the sequences
          across two plasmid libraries, each of which contained the same 150 scrambled sequences as
          internal controls (<a href="#supp1" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary files 1 and 2</a>). We cloned
          sequences upstream of the rod photoreceptor-specific <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rhodopsin</em> (<em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em>) promoter and a <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">DsRed</em> reporter gene, electroporated
          libraries into explanted mouse retinas at P0 in triplicate, harvested the retinas at P8,
          and then sequenced the RNA and input DNA plasmid pool. The data are highly reproducible
          across replicates (R<sup itemscope="" itemtype="http://schema.stenci.la/Superscript"><span
              data-itemtype="http://schema.org/Number">2</span></sup> &gt; 0.96, <a href="#fig1s1"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 1—figure supplement 1</a>).
          After activity scores were calculated and normalized to the basal <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter, the two libraries were
          well calibrated and were merged (two-sample Kolmogorov-Smirnov test p = 0.09, <a
            href="#fig1s2" itemscope="" itemtype="http://schema.stenci.la/Link">Figure 1—figure
            supplement 2</a>, <a href="#supp3" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 3</a>, and Materials and
          methods).</p>
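        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">As a rough illustration of how
          barcode counts might become basal-normalized activity scores, the sketch below scales
          counts to counts per million, divides RNA by DNA for each barcode, averages the barcodes
          of each sequence, and divides by the basal promoter value. It is a simplified,
          single-replicate sketch and not the implementation used for the analysis: the function
          names, the toy barcodes, and the toy counts are all hypothetical, and the filtering of
          detection-limited barcodes is omitted.</p>

```python
from statistics import mean


def counts_per_million(counts):
    """Scale raw barcode counts so that the sample sums to one million."""
    total = sum(counts.values())
    return {barcode: 1e6 * n / total for barcode, n in counts.items()}


def activity_scores(dna_counts, rna_counts, barcode_to_sequence, basal_label="BASAL"):
    """Hypothetical helper: RNA/DNA ratio per barcode, averaged per sequence,
    then expressed relative to the basal promoter."""
    dna_cpm = counts_per_million(dna_counts)
    rna_cpm = counts_per_million(rna_counts)

    # Collect the RNA/DNA ratio of every barcode under its parent sequence.
    ratios_by_sequence = {}
    for barcode, sequence in barcode_to_sequence.items():
        ratio = rna_cpm[barcode] / dna_cpm[barcode]
        ratios_by_sequence.setdefault(sequence, []).append(ratio)

    # Average across the barcodes that tag the same sequence.
    averaged = {seq: mean(ratios) for seq, ratios in ratios_by_sequence.items()}

    # Normalize so that the basal promoter alone has an activity of 1.
    basal = averaged[basal_label]
    return {seq: value / basal for seq, value in averaged.items()}


# Toy data (made up for illustration): two barcodes per sequence, one replicate.
toy_barcode_to_sequence = {
    "AAA": "BASAL",
    "AAC": "BASAL",
    "AAG": "toy_enhancer",
    "AAT": "toy_enhancer",
}
toy_dna = {"AAA": 100, "AAC": 100, "AAG": 100, "AAT": 100}
toy_rna = {"AAA": 50, "AAC": 50, "AAG": 150, "AAT": 150}

toy_scores = activity_scores(toy_dna, toy_rna, toy_barcode_to_sequence)
print(toy_scores)
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">In this toy example the basal
          promoter has an activity of 1 by construction and the hypothetical enhancer has an
          activity of 3, that is, threefold above basal. Dividing RNA by DNA before averaging
          corrects for unequal representation of barcodes in the plasmid pool.</p>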
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="3" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Process data for the Rho promoter: convert counts into activity scores for each sequence
library_names = [&quot;library1&quot;, &quot;library2&quot;]
rho_activity_data = {} # {library name: pd.DataFrame}
barcode_count_dir = os.path.join(data_dir, &quot;Rhodopsin&quot;)

for library in library_names:
    print(f&quot;Processing data for {library} with the Rho promoter...&quot;)
    # File names
    barcode_count_files = [
        os.path.join(barcode_count_dir, f&quot;{library}{sample}.counts&quot;)
        for sample in [&quot;Plasmid&quot;, &quot;Rna1&quot;, &quot;Rna2&quot;, &quot;Rna3&quot;]
    ]
    
    # Masks and metadata for downstream functions
    sample_labels = np.array([&quot;DNA&quot;, &quot;RNA1&quot;, &quot;RNA2&quot;, &quot;RNA3&quot;])
    sample_rna_mask = np.array([False, True, True, True])
    rna_labels = sample_labels[sample_rna_mask]
    dna_labels = sample_labels[np.logical_not(sample_rna_mask)]
    n_samples = len(sample_labels)
    n_rna_samples = len(rna_labels)
    n_dna_samples = len(dna_labels)
    n_barcodes_per_sequence = 3
    
    # Read in the barcode counts
    print(&quot;Reading in barcode counts.&quot;)
    all_sample_counts_df = quality_control.read_bc_count_files(barcode_count_files, sample_labels)
    display(all_sample_counts_df.head())
    
    # Remove barcodes that are detection-limited.
    # Barcodes below the DNA cutoff are NaN (because they are missing from the input plasmid pool)
    # Barcodes below any of the RNA cutoffs are zero in all replicates
    print(&quot;Removing detection-limited barcodes and normalizing to counts per million.&quot;)
    cutoffs = [10, 5, 5, 5]
    threshold_sample_counts_df = quality_control.filter_low_counts(all_sample_counts_df, sample_labels, cutoffs,
                                                                   dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence)
    display(threshold_sample_counts_df.head())

    # Normalize RNA barcode counts by plasmid barcode counts
    print(&quot;Normalizing RNA to DNA.&quot;)
    normalized_sample_counts_df = quality_control.normalize_rna_by_dna(threshold_sample_counts_df, rna_labels, dna_labels)
    # Drop DNA
    barcode_sample_counts_df = normalized_sample_counts_df.drop(columns=dna_labels)
    
    # Average across barcodes
    print(&quot;Averaging across barcodes within a replicate.&quot;)
    activity_replicate_df = quality_control.average_barcodes(barcode_sample_counts_df)
    display(activity_replicate_df.head())
    
    # Basal-normalize, average across replicates, do statistics
    print(&quot;Normalizing to the basal Rho promoter.&quot;)
    sequence_expression_df = quality_control.basal_normalize(activity_replicate_df, &quot;BASAL&quot;)
    print(&quot;Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.&quot;)
    sequence_expression_df[&quot;expression_pvalue&quot;] = quality_control.log_ttest_vs_basal(activity_replicate_df, &quot;BASAL&quot;)
    sequence_expression_df[&quot;expression_qvalue&quot;] = modeling.fdr(sequence_expression_df[&quot;expression_pvalue&quot;])
    print(&quot;Done processing data!&quot;)
    display(sequence_expression_df.head())
    
    rho_activity_data[library] = sequence_expression_df</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Processing data for library1 with the Rho promoter...
Reading in barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr16-87432635-87432799_CPPQ_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3019</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">148</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">325</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">97</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACCGC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-119112319-119112483_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4117</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">24493</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">25950</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">23406</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGGG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-128854234-128854398_UPCE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">86</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">76</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">39</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">233</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-138107597-138107761_UPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">827</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">926</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">857</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">659</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-31298508-31298672_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">7170</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">492</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">392</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">149</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing detection-limited barcodes and normalizing to counts per million.
Barcodes missing in DNA:
Sample DNA: 1090 barcodes
1090 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 1744 barcodes
Sample RNA2: 1913 barcodes
Sample RNA3: 1491 barcodes
2215 barcodes are off in more than 0 RNA samples.
There are a total of  157.151 million barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr16-87432635-87432799_CPPQ_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">73.436588</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4.307406</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">7.418047</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.561422</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACCGC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-119112319-119112483_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">100.145224</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">712.846538</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">592.302519</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">618.068596</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGGG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-128854234-128854398_UPCE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.091933</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.211911</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.890166</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.152695</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-138107597-138107761_UPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">20.116614</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">26.95039</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">19.560819</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">17.401829</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-31298508-31298672_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">174.408855</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">14.319214</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">8.947306</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.934556</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing RNA to DNA.
Averaging across barcodes within a replicate.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">BASAL</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.331679</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.306512</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.277308</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.005172</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.826315</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.930872</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.114088</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.080287</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.091619</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.180305</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.094909</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.798394</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.441799</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.533383</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.86899</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing to the basal Rho promoter.
Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.
</code></pre>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>/home/ryan/Documents/DBBS/CohenLab/Manuscripts/CRX-Information-Content/utils/quality_control.py:408: RuntimeWarning: invalid value encountered in double_scalars
  cov = std / mean
</code></pre>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Done processing data!
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_std</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_reps</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_pvalue
                  </th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_qvalue
                  </th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.027744</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.330482</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000139</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000749</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.606621</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.297412</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.001206</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.003548</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.336604</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.396284</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.003039</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.007388</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.068611</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.944664</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.080583</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.103242</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.439587</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.579277</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.27973</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.312931</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Processing data for library2 with the Rho promoter...
Reading in barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">132</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGTT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1779</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">36</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">17</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">46</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2928</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">433</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">802</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">510</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTCG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr12-116230818-116230982_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2822</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3043</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2967</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3013</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1810</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1572</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2281</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1559</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing detection-limited barcodes and normalizing to counts per million.
Barcodes missing in DNA:
Sample DNA: 277 barcodes
277 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 875 barcodes
Sample RNA2: 678 barcodes
Sample RNA3: 774 barcodes
1180 barcodes are off in more than 0 RNA samples.
There are a total of  157.724 million barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.144868</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGTT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">42.384243</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.933407</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.406204</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.301935</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">69.758888</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">11.226812</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">19.16328</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">14.434499</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTCG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr12-116230818-116230982_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">67.233464</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">78.898818</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">70.894577</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">85.276757</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">43.12281</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">40.758772</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">54.503043</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">44.124283</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing RNA to DNA.
Averaging across barcodes within a replicate.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">BASAL</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.196778</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.218638</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.236666</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">7.325586</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">5.922791</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.286389</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.418129</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">5.188716</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4.97623</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_MUT-shape</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.282047</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.264416</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.290612</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.260469</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.27625</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.212923</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing to the basal Rho promoter.
Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.
Done processing data!
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_std</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_reps</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_pvalue
                  </th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_qvalue
                  </th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">30.293101</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.01123</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000003</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000128</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">25.791454</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.063103</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000019</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000167</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_MUT-shape</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.290214</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.124284</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.023905</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.031469</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.162281</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.229405</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.226254</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.246199</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.995027</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.380942</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.012703</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.018175</span></td>
                </tr>
              </tbody>
            </table><img src="index.html.media/0" alt="" itemscope=""
              itemtype="http://schema.org/ImageObject"><img src="index.html.media/1" alt=""
              itemscope="" itemtype="http://schema.org/ImageObject"><img src="index.html.media/2"
              alt="" itemscope="" itemtype="http://schema.org/ImageObject"><img
              src="index.html.media/3" alt="" itemscope="" itemtype="http://schema.org/ImageObject">
          </figure>
        </stencila-code-chunk>
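        <p>The numbers printed above can be sanity-checked by hand. The block below is a minimal sketch, not the <code>quality_control</code> module itself. It assumes that counts per million are computed per sample (each raw count divided by that sample's own total), and that "normalizing to the basal promoter" means dividing each replicate by the BASAL value of the same replicate, then taking the mean and sample standard deviation across replicates. Both readings are inferred from the displayed values. The helper name <code>normalize_to_basal</code> is hypothetical, and all input values are copied from the library2 Rho output tables.</p>

```python
import pandas as pd

# --- Counts per million (assumed to be computed per sample) ---
# Raw counts and CPM values for barcode AACAACTCG, copied from the library2 tables.
raw_counts = pd.Series({"DNA": 2822, "RNA1": 3043, "RNA2": 2967, "RNA3": 3013})
cpm_values = pd.Series(
    {"DNA": 67.233464, "RNA1": 78.898818, "RNA2": 70.894577, "RNA3": 85.276757}
)

# If CPM = raw / (sample total / 1e6), then raw / CPM is each sample's total in millions.
implied_totals_millions = raw_counts / cpm_values
total_millions = implied_totals_millions.sum()

# --- Normalizing to the basal promoter (assumed per-replicate division) ---
# Per-replicate activities copied from the "Averaging across barcodes" table.
replicate_activity = pd.DataFrame(
    {
        "RNA1": [0.196778, 7.325586, 6.418129],
        "RNA2": [0.218638, 5.922791, 5.188716],
        "RNA3": [0.236666, 6.286389, 4.976230],
    },
    index=pd.Index(
        [
            "BASAL",
            "chr1-10229074-10229238_CPPE_MUT-allCrxSites",
            "chr1-10229074-10229238_CPPE_WT",
        ],
        name="label",
    ),
)


def normalize_to_basal(activity_df, basal_label="BASAL"):
    """Hypothetical helper: divide each replicate by its basal value, then summarize."""
    # axis=1 aligns the basal row (indexed by replicate name) with the columns.
    relative = activity_df.drop(index=basal_label).div(activity_df.loc[basal_label], axis=1)
    return pd.DataFrame(
        {
            "expression": relative.mean(axis=1),
            # pandas uses ddof=1 (sample standard deviation) by default.
            "expression_std": relative.std(axis=1),
            "expression_reps": relative.count(axis=1),
        }
    )


summary = normalize_to_basal(replicate_activity)
print(f"Implied total: {total_millions:.3f} million barcode counts")
print(summary.round(3))
```

        <p>Under these assumptions the implied total agrees with the logged 157.724 million barcode counts, and the two CPPE rows agree with the <code>expression</code> and <code>expression_std</code> columns of the final table, up to rounding of the displayed inputs. The p-values and q-values are not reproduced here because the test used is not shown in this section.</p>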
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="4" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Now process data for the Polylinker (the experiment is in Fig 4, but it is easier to process the data here)
# Process data for the Polylinker: convert counts into activity scores for each sequence
library_names = [&quot;library1&quot;, &quot;library2&quot;]
polylinker_activity_data = {} # {library name: pd.DataFrame}
barcode_count_dir = os.path.join(data_dir, &quot;Polylinker&quot;)

for library in library_names:
    print(f&quot;Processing data for {library} with the Polylinker...&quot;)
    # File names
    barcode_count_files = [
        os.path.join(barcode_count_dir, f&quot;{library}{sample}.counts&quot;)
        for sample in [&quot;Plasmid&quot;, &quot;Rna1&quot;, &quot;Rna2&quot;, &quot;Rna3&quot;]
    ]
    
    # Masks and metadata for downstream functions
    sample_labels = np.array([&quot;DNA&quot;, &quot;RNA1&quot;, &quot;RNA2&quot;, &quot;RNA3&quot;])
    sample_rna_mask = np.array([False, True, True, True])
    rna_labels = sample_labels[sample_rna_mask]
    dna_labels = sample_labels[np.logical_not(sample_rna_mask)]
    n_samples = len(sample_labels)
    n_rna_samples = len(rna_labels)
    n_dna_samples = len(dna_labels)
    n_barcodes_per_sequence = 3
    
    # Read in the barcode counts
    print(&quot;Reading in barcode counts.&quot;)
    all_sample_counts_df = quality_control.read_bc_count_files(barcode_count_files, sample_labels)
    display(all_sample_counts_df.head())
    
    # Remove barcodes that are detection-limited, in two passes.
    print(&quot;Removing barcodes missing from the DNA pool and normalizing to counts per million.&quot;)
    # First pass: only the DNA cutoff is nonzero, so only the plasmid pool is filtered here
    cutoffs_dna_only = [50, 0, 0, 0]
    # Barcodes below the DNA cutoff are NaN (because they are missing from the input plasmid pool)
    print(&quot;Removing detection-limited barcodes and normalizing to counts per million.&quot;)
    threshold_sample_counts_df = quality_control.filter_low_counts(all_sample_counts_df, sample_labels, cutoffs_dna_only,
                                                                   dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence)
    # Second pass: only the RNA cutoffs are nonzero; cpm_normalize=False, presumably because the first pass already normalized the counts
    # Barcodes below any of the RNA cutoffs are zero in all replicates
    print(&quot;Now removing RNA barcodes missing from any replicate.&quot;)
    cutoffs_rna_cpm = [0, 8, 8, 8]
    threshold_sample_counts_df = quality_control.filter_low_counts(threshold_sample_counts_df, sample_labels, cutoffs_rna_cpm,
                                                                   dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence, cpm_normalize=False)
    display(threshold_sample_counts_df.head())

    # Normalize RNA barcode counts by plasmid barcode counts
    print(&quot;Normalizing RNA to DNA.&quot;)
    normalized_sample_counts_df = quality_control.normalize_rna_by_dna(threshold_sample_counts_df, rna_labels, dna_labels)
    # Drop DNA
    barcode_sample_counts_df = normalized_sample_counts_df.drop(columns=dna_labels)
    
    # Average across barcodes
    print(&quot;Averaging across barcodes within a replicate.&quot;)
    activity_replicate_df = quality_control.average_barcodes(barcode_sample_counts_df)
    display(activity_replicate_df.head())
    
    # Drop &quot;basal&quot; and average across replicates
    print(&quot;Removing the &#39;basal&#39; promoter (Polylinker) and averaging across replicates. No statistical analysis is performed here.&quot;)
    activity_replicate_df = activity_replicate_df.drop(index=&quot;BASAL&quot;)
    sequence_expression_df = activity_replicate_df.apply(lambda x: pd.Series({&quot;expression&quot;: x.mean(), &quot;expression_SEM&quot;: x.sem()}), axis=1)
    print(&quot;Done processing data!&quot;)
    display(sequence_expression_df.head())
    
    polylinker_activity_data[library] = sequence_expression_df</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Processing data for library1 with the Polylinker...
Reading in barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr16-87432635-87432799_CPPQ_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">987</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">10</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACCGC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-119112319-119112483_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1326</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4963</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4554</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">17827</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGGG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-128854234-128854398_UPCE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">35</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-138107597-138107761_UPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">5</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">8</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-31298508-31298672_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">5007</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">934</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">993</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">575</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing barcodes missing from the DNA pool and normalizing to counts per million.
Removing detection-limited barcodes and normalizing to counts per million.
Barcodes missing in DNA:
Sample DNA: 1722 barcodes
1722 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 0 barcodes
Sample RNA2: 0 barcodes
Sample RNA3: 0 barcodes
0 barcodes are off in more than 0 RNA samples.
There are a total of 92.122 million barcode counts.
Now removing RNA barcodes missing from any replicate.
Barcodes missing in DNA:
Sample DNA: 0 barcodes
0 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 5842 barcodes
Sample RNA2: 11412 barcodes
Sample RNA3: 9805 barcodes
12991 barcodes are off in more than 0 RNA samples.
</code></pre>
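            <p>The filtering and scaling reported in the log above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used by the article: the names <code>DNA_CUTOFF</code>, <code>counts_per_million</code> and <code>filter_and_normalize_dna</code> are hypothetical, the cutoff value is an assumption (the real threshold is not shown in this output), and the order of filtering and scaling is assumed. Only the five raw DNA counts are taken from the first table above, so the resulting counts per million differ from the full-library values.</p>
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>"""Sketch: drop barcodes under-sampled in the DNA pool, then scale to counts per million."""

# Hypothetical cutoff; the value used by the article code is not shown in this output.
DNA_CUTOFF = 50


def counts_per_million(counts):
    """Scale a {barcode: raw count} mapping so that the values sum to one million."""
    total = sum(counts.values())
    return {barcode: n / total * 1e6 for barcode, n in counts.items()}


def filter_and_normalize_dna(dna_counts, cutoff=DNA_CUTOFF):
    """Keep barcodes with at least `cutoff` raw DNA counts, then convert to CPM."""
    kept = {barcode: n for barcode, n in dna_counts.items() if n >= cutoff}
    return counts_per_million(kept)


# Raw DNA counts for the five library1 barcodes shown in the first table above.
dna = {
    "AACAACAAG": 987,
    "AACAACCGC": 1326,
    "AACAACGGG": 35,
    "AACAACTAC": 5,
    "AACAACTGT": 5007,
}
dna_cpm = filter_and_normalize_dna(dna)
print(sorted(dna_cpm))
</code></pre>
            <p>With this assumed cutoff, the two barcodes with 35 and 5 raw DNA counts are dropped, which matches the barcodes whose DNA value is reported as NaN after filtering.</p>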
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr16-87432635-87432799_CPPQ_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">48.214705</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACCGC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-119112319-119112483_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">64.774771</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">238.306557</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">198.604223</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">639.087016</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGGG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-128854234-128854398_UPCE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">NaN</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr4-138107597-138107761_UPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">NaN</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-31298508-31298672_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">244.590708</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">44.847537</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">43.305664</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">20.613397</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing RNA to DNA.
Averaging across barcodes within a replicate.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">BASAL</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.742818</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.983263</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.267636</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
              </tbody>
            </table>
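            <p>The two steps named in the log above, normalizing RNA to DNA and averaging across barcodes within a replicate, might look roughly like the following. This is a hedged sketch: <code>activity_per_label</code> is a hypothetical helper, the input values are made up for illustration, and it assumes that barcodes missing from the DNA pool were already removed, so no DNA value is zero.</p>
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>from collections import defaultdict
from statistics import mean


def activity_per_label(rows):
    """Average per-barcode RNA/DNA ratios over all barcodes sharing a label.

    `rows` is a list of (label, dna_cpm, rna_cpm) tuples for one RNA replicate.
    Assumes every dna_cpm is non-zero (DNA-missing barcodes already removed).
    """
    ratios = defaultdict(list)
    for label, dna_cpm, rna_cpm in rows:
        ratios[label].append(rna_cpm / dna_cpm)
    return {label: mean(values) for label, values in ratios.items()}


# Made-up CPM values for illustration only; not taken from the library data.
rows = [
    ("elementA", 50.0, 100.0),
    ("elementA", 40.0, 40.0),
    ("BASAL", 20.0, 10.0),
]
activity = activity_per_label(rows)
print(activity)
</code></pre>
            <p>Taking the ratio per barcode before averaging gives each barcode equal weight regardless of its abundance in the plasmid pool; whether the article code weights barcodes this way is an assumption of this sketch.</p>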
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing the &#39;basal&#39; promoter (Polylinker) and averaging across replicates. No statistical analysis is performed here.
Done processing data!
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_SEM</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-104768570-104768734_UPCQ_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106008207-106008371_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
              </tbody>
            </table>
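            <p>The <code>expression</code> and <code>expression_SEM</code> columns above summarize the three RNA replicates. One plausible way to compute them is sketched below, assuming that each replicate is first divided by its own basal promoter value and that the SEM uses the sample standard deviation. The helper <code>summarize</code> and the element values are hypothetical; only the basal values are copied from the per-replicate table above.</p>
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>from math import sqrt
from statistics import mean, stdev


def summarize(activity, basal):
    """Divide each replicate by its basal value, then return (mean, SEM)."""
    relative = [a / b for a, b in zip(activity, basal)]
    sem = stdev(relative) / sqrt(len(relative))
    return mean(relative), sem


# BASAL values for RNA1 to RNA3 as printed in the per-replicate table above.
basal = [0.742818, 0.983263, 1.267636]
# Made-up element activities, chosen to be exactly twice the basal row.
element = [1.485636, 1.966526, 2.535272]
expression, expression_sem = summarize(element, basal)
print(expression, expression_sem)
</code></pre>
            <p>Because the made-up element is twice basal in every replicate, the sketch returns an expression of 2 with an SEM of 0.</p>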
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Processing data for library2 with the Polylinker...
Reading in barcode counts.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">20</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">15</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">21</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGTT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">990</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">10</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">9</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">10</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1056</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTCG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr12-116230818-116230982_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1653</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1441</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">9</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">4695</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing barcodes missing from the DNA pool and normalizing to counts per million.
Removing detection-limited barcodes and normalizing to counts per million.
Barcodes missing in DNA:
Sample DNA: 2107 barcodes
2107 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 0 barcodes
Sample RNA2: 0 barcodes
Sample RNA3: 0 barcodes
0 barcodes are off in more than 0 RNA samples.
There are a total of 89.662 million barcode counts.
Now removing RNA barcodes missing from any replicate.
Barcodes missing in DNA:
Sample DNA: 0 barcodes
0 barcodes are missing from more than 0 DNA samples.
Barcodes off in RNA:
Sample RNA1: 12647 barcodes
Sample RNA2: 12055 barcodes
Sample RNA3: 10999 barcodes
13873 barcodes are off in more than 0 RNA samples.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">DNA</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">barcode</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACAAG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">NaN</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACGTT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">38.377926</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTAC</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">40.936454</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTCG</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr12-116230818-116230982_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">NaN</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">AACAACTGT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">64.079506</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Normalizing RNA to DNA.
Averaging across barcodes within a replicate.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA2</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RNA3</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">BASAL</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.486824</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.405204</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.305344</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_MUT-shape</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
              </tbody>
            </table>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Removing the &#39;basal&#39; promoter (Polylinker) and averaging across replicates. No statistical analysis is performed here.
Done processing data!
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_SEM</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.06579</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.334422</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-10229074-10229238_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_MUT-shape</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-106171416-106171580_CSPE_scrambled</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0</span></td>
                </tr>
              </tbody>
            </table><img src="index.html.media/4" alt="" itemscope=""
              itemtype="http://schema.org/ImageObject"><img src="index.html.media/5" alt=""
              itemscope="" itemtype="http://schema.org/ImageObject"><img src="index.html.media/6"
              alt="" itemscope="" itemtype="http://schema.org/ImageObject"><img
              src="index.html.media/7" alt="" itemscope="" itemtype="http://schema.org/ImageObject">
          </figure>
        </stencila-code-chunk>
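        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The final step reported in the output above, averaging across replicates, can be checked by hand for a single sequence. The following is a minimal sketch using only the Python standard library, not the authors' processing code. It assumes the reported <code>expression</code> is the arithmetic mean of the three replicate values and that <code>expression_SEM</code> is the sample standard deviation divided by the square root of the number of replicates. The helper name <code>summarize_replicates</code> is hypothetical.</p>

```python
import math
import statistics

# Barcode-averaged RNA/DNA values for one sequence in three replicates,
# copied from the replicate table above
# (chr1-10229074-10229238_CPPE_MUT-allCrxSites).
replicate_values = {"RNA1": 1.486824, "RNA2": 0.405204, "RNA3": 1.305344}


def summarize_replicates(values):
    """Return the mean and standard error of the mean across replicates."""
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


expression, expression_sem = summarize_replicates(list(replicate_values.values()))
print(f"expression = {expression:.6f}, SEM = {expression_sem:.6f}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Under these assumptions the result should agree, up to rounding, with the values listed for this sequence in the final table (1.06579 and 0.334422).</p>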
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig1s1"
          title="Figure 1—figure supplement 1."><label data-itemprop="label">Figure 1—figure
            supplement 1.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="5" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># File names of the raw barcode counts
raw_data_files = [os.path.join(data_dir, dirname, filename) for dirname, filename in itertools.product([&quot;Rhodopsin&quot;, &quot;Polylinker&quot;], [&quot;library1RawBarcodeCounts.txt&quot;, &quot;library2RawBarcodeCounts.txt&quot;])]
raw_data_names = [&quot;Library 1\n+Rho&quot;, &quot;Library 2\n+Rho&quot;, &quot;Library 1\n+Polylinker&quot;, &quot;Library 2\n+Polylinker&quot;]
comparison_columns = [&quot;Rep 1 vs 2&quot;, &quot;Rep 1 vs 3&quot;, &quot;Rep 2 vs 3&quot;]
fig, ax_list = plt.subplots(nrows=4, ncols=3, figsize=(8, 8))

# Read in each dataset
for row, filename in enumerate(raw_data_files):
    row_df = pd.read_csv(filename, sep=&quot;\t&quot;)
    # Get all 3 pairs of combinations and plot them
    for col, (x, y) in enumerate(itertools.combinations([&quot;RNA1&quot;, &quot;RNA2&quot;, &quot;RNA3&quot;], 2)):
        rsquared = stats.pearsonr(row_df[x], row_df[y])[0] ** 2
        ax = ax_list[row, col]
        ax.scatter(row_df[x] / 1000, row_df[y] / 1000, color=&quot;k&quot;)
        ax.text(0.02, 0.98, fr&quot;$r^2$={rsquared:.2f}&quot;, transform=ax.transAxes, ha=&quot;left&quot;, va=&quot;top&quot;)
        max_value = max(ax.get_xlim()[1], ax.get_ylim()[1])
        ax.set_xlim(right=max_value)
        ax.set_ylim(top=max_value)
        
# Add &quot;axis&quot; labels
fig.text(0.5, 0.025, &quot;Raw barcode counts (thousands)&quot;, ha=&quot;center&quot;, va=&quot;top&quot;, fontsize=14)
fig.text(0.025, 0.5, &quot;Raw barcode counts (thousands)&quot;, rotation=90, ha=&quot;right&quot;, va=&quot;center&quot;, fontsize=14)

# Add column labels at the top
for col, text in enumerate(comparison_columns):
    ax_list[0, col].set_title(text)
    
# Add row labels on the right
for row, text in enumerate(raw_data_names):
    twinax = ax_list[row, 2].twinx()
    twinax.set_ylabel(text)
    twinax.set_yticks([])
    
display(fig)
plt.close()</code></pre>
            <figure slot="outputs"><img src="index.html.media/8" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject"></figure>
          </stencila-code-chunk>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="reproducibility-of-massively-parallel-reporter-assay-mpra-measurements">
              Reproducibility of massively parallel reporter assay (MPRA) measurements.</h4>
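            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Each row shows a
              different combination of library and promoter (<em itemscope=""
                itemtype="http://schema.stenci.la/Emphasis">Rho</em> or Polylinker). In each
              column, the replicate named first in the title is plotted on the x-axis and the
              replicate named second is plotted on the y-axis.</p>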
          </figcaption>
        </figure>
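        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The pairwise replicate comparison in this figure supplement reduces to squaring the Pearson correlation for each of the three replicate pairs. The sketch below illustrates that calculation with the standard library only. The counts are hypothetical toy values, not data from the study, and <code>pearson_r</code> is a hand-written stand-in for <code>stats.pearsonr</code>.</p>

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical raw barcode counts for five barcodes in three RNA replicates
counts = {
    "RNA1": [10, 200, 35, 80, 0],
    "RNA2": [12, 180, 40, 75, 1],
    "RNA3": [9, 220, 30, 90, 0],
}

# One r^2 value per replicate pair, in the same order as the figure columns
rsquared = {
    (x, y): pearson_r(counts[x], counts[y]) ** 2
    for x, y in itertools.combinations(["RNA1", "RNA2", "RNA3"], 2)
}
for (x, y), value in rsquared.items():
    print(f"{x} vs {y}: r^2 = {value:.3f}")
```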
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig1s2"
          title="Figure 1—figure supplement 2."><label data-itemprop="label">Figure 1—figure
            supplement 2.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="6" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code>library1_rho_df = rho_activity_data[&quot;library1&quot;]
library1_rho_df[&quot;library&quot;] = 1
library2_rho_df = rho_activity_data[&quot;library2&quot;]
library2_rho_df[&quot;library&quot;] = 2

# Get scrambled sequences from each library with RNA barcodes measured
scrambled_library1_df = library1_rho_df[library1_rho_df.index.str.contains(&quot;scrambled&quot;) &amp; (library1_rho_df[&quot;expression&quot;] &gt; 0)]
scrambled_library2_df = library2_rho_df[library2_rho_df.index.str.contains(&quot;scrambled&quot;) &amp; (library2_rho_df[&quot;expression&quot;] &gt; 0)]

# Compare distributions of log2 expression
scrambled_library1_expr = np.log2(scrambled_library1_df[&quot;expression&quot;])
scrambled_library2_expr = np.log2(scrambled_library2_df[&quot;expression&quot;])
ks_stat, pval = stats.ks_2samp(scrambled_library1_expr, scrambled_library2_expr)
print(f&quot;Scrambled sequences from L1 and L2 are drawn from the same distribution, KS test p = {pval:.3f}, D = {ks_stat:.2f}&quot;)

# Show the two histograms
fig, ax = plt.subplots()
ax.hist([scrambled_library2_expr, scrambled_library1_expr], bins=&quot;auto&quot;, histtype=&quot;stepfilled&quot;, density=True, label=[&quot;library 2&quot;, &quot;library 1&quot;], color=plot_utils.set_color([0.75, 0.25]), alpha=0.5)
ax.set_xlabel(&quot;log2 Scrambled Activity/Rho&quot;)
ax.set_ylabel(&quot;Density&quot;)
ax.legend(loc=&quot;upper left&quot;, frameon=False)
display(fig)
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Scrambled sequences from L1 and L2 are drawn from the same distribution, KS test p = 0.087, D = 0.14
</code></pre><img src="index.html.media/9" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="calibration-of-massively-parallel-reporter-assay-mpra-libraries-with-the-rho-promoter">
              Calibration of massively parallel reporter assay (MPRA) libraries with the <em
                itemscope="" itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Probability density
              histograms of the log2 activity of the same 150 scrambled sequences, measured in
              each of the two libraries, after normalizing to the basal <em itemscope=""
                itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter.</p>
          </figcaption>
        </figure>
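        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The D value reported with the KS test above is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes that statistic with the standard library to show what is being measured. It uses hypothetical toy samples, omits the p-value calculation that <code>stats.ks_2samp</code> performs, and the function name <code>ks_statistic</code> is an assumption for illustration.</p>

```python
import bisect


def ks_statistic(sample1, sample2):
    """Two-sample KS statistic D: the largest gap between the empirical CDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    d = 0.0
    # The largest gap always occurs at one of the observed values
    for value in s1 + s2:
        cdf1 = bisect.bisect_right(s1, value) / len(s1)
        cdf2 = bisect.bisect_right(s2, value) / len(s2)
        d = max(d, abs(cdf1 - cdf2))
    return d


# Hypothetical log2 activities for scrambled sequences in two libraries
library1_expr = [0.1, 0.4, 0.7, 1.0]
library2_expr = [0.2, 0.5, 0.8, 1.1]

d_stat = ks_statistic(library1_expr, library2_expr)
print(f"D = {d_stat:.2f}")
```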
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="7" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Join and annotate all data
print(&quot;Joining together data from the two libraries with the Rho promoter.&quot;)
color_mapping = {
    &quot;Strong enhancer&quot;: &quot;#1f78b4&quot;,
    &quot;Weak enhancer&quot;: &quot;#a6cee3&quot;,
    &quot;Inactive&quot;: &quot;#33a02c&quot;,
    &quot;Silencer&quot;: &quot;#e31a1c&quot;,
    np.nan: &quot;grey&quot;
}

# Join the libraries and add a pseudocount to take log2
rho_df = pd.concat([library1_rho_df, library2_rho_df])
rho_pseudocount = 1e-3
rho_df[&quot;expression_log2&quot;] = np.log2(rho_df[&quot;expression&quot;] + rho_pseudocount)

# Define cutoff for a strong enhancer based on scrambled sequences
print(&quot;Annotating sequences as strong enhancer, weak enhancer, inactive, silencer, or ambiguous.&quot;)
scrambled_mask = rho_df.index.str.contains(&quot;scrambled&quot;)
scrambled_df = rho_df[scrambled_mask]
scrambled_df = scrambled_df[scrambled_df[&quot;expression&quot;].notna()]
strong_cutoff = scrambled_df[&quot;expression_log2&quot;].quantile(0.95)
print(f&quot;Cutoff to call something a strong enhancer: activity is above {strong_cutoff:.2f}&quot;)

# Drop scrambled sequences
rho_df = rho_df[~scrambled_mask]

# Helper function to label and color a sequence
def label_color_sequence(row, alpha=0.05, strong_cutoff=strong_cutoff, inactive_cutoff=1, color_mapping=color_mapping):
    expr_log2 = row[&quot;expression_log2&quot;]
    qval = row[&quot;expression_qvalue&quot;]
    # Inactive
    if (np.abs(expr_log2) &lt;= inactive_cutoff) &amp; (qval &gt;= alpha):
        group = &quot;Inactive&quot;
    # Silencer
    elif (expr_log2 &lt; -inactive_cutoff) &amp; ((qval &lt; alpha) | (row[&quot;expression&quot;] == 0)):
        group = &quot;Silencer&quot;
    # Enhancer
    elif (expr_log2 &gt; inactive_cutoff) &amp; (qval &lt; alpha):
        # Strong
        if expr_log2 &gt; strong_cutoff:
            group = &quot;Strong enhancer&quot;
        # Weak
        else:
            group = &quot;Weak enhancer&quot;
    # Ambiguous
    else:
        group = np.nan
    
    color = color_mapping[group]
    return pd.Series({&quot;group_name&quot;: group, &quot;plot_color&quot;: color})

# Annotate both WT and MUT sequences
rho_df = rho_df.join(rho_df.apply(label_color_sequence, axis=1))
rho_df[&quot;group_name&quot;] = sequence_annotation_processing.to_categorical(rho_df[&quot;group_name&quot;])

# Now do Polylinker data
library1_poly_df = polylinker_activity_data[&quot;library1&quot;]
library2_poly_df = polylinker_activity_data[&quot;library2&quot;]
print(&quot;Joining together data from the two libraries with the Polylinker promoter and annotate for autonomous activity.&quot;)
poly_df = pd.concat([library1_poly_df, library2_poly_df])
poly_pseudocount = 1e-2
poly_df[&quot;expression_log2&quot;] = np.log2(poly_df[&quot;expression&quot;] + poly_pseudocount)
poly_df[&quot;autonomous_activity&quot;] = (poly_df[&quot;expression_log2&quot;] &gt; 0)

# Compute effect of mutating CRX motifs in the presence of the Rho promoter.
print(&quot;Computing the effect size upon mutating CRX motifs in the presence of the Rho promoter.&quot;)
print(&quot;This is for Figure 5, but it is easier to do it here.&quot;)
wt_mask = rho_df.index.str.contains(&quot;_WT$&quot;)
mut_mask = rho_df.index.str.contains(&quot;_MUT-allCrxSites$&quot;)

# Add variant info as a column, then trim it off the index
rho_df_no_variant_df = rho_df.copy()
rho_df_no_variant_df[&quot;variant&quot;] = rho_df_no_variant_df.index.str.split(&quot;_&quot;).str[2:].str.join(&quot;_&quot;)
rho_df_no_variant_df = sequence_annotation_processing.remove_mutations_from_seq_id(rho_df_no_variant_df)

# Separate out WT and MUT, then join them together on the same row
wt_df = rho_df_no_variant_df[wt_mask]
mut_df = rho_df_no_variant_df[mut_mask]
wt_vs_mut_rho_df = wt_df.join(mut_df, lsuffix=&quot;_WT&quot;, rsuffix=&quot;_MUT&quot;)
wt_vs_mut_rho_df[&quot;wt_vs_mut_log2&quot;] = wt_vs_mut_rho_df[&quot;expression_log2_WT&quot;] - wt_vs_mut_rho_df[&quot;expression_log2_MUT&quot;]

# Compute parameters for lognormal distribution to do stats
wt_cov = wt_vs_mut_rho_df[&quot;expression_std_WT&quot;] / wt_vs_mut_rho_df[&quot;expression_WT&quot;]
wt_log_mean = np.log(wt_vs_mut_rho_df[&quot;expression_WT&quot;] / np.sqrt(wt_cov**2 + 1))
wt_log_std = np.sqrt(np.log(wt_cov**2 + 1))
mut_cov = wt_vs_mut_rho_df[&quot;expression_std_MUT&quot;] / wt_vs_mut_rho_df[&quot;expression_MUT&quot;]
mut_log_mean = np.log(wt_vs_mut_rho_df[&quot;expression_MUT&quot;] / np.sqrt(mut_cov**2 + 1))
mut_log_std = np.sqrt(np.log(mut_cov**2 + 1))

# Do t-tests and FDR
wt_vs_mut_rho_df[&quot;wt_vs_mut_pvalue&quot;] = stats.ttest_ind_from_stats(wt_log_mean, wt_log_std, wt_vs_mut_rho_df[&quot;expression_reps_WT&quot;], mut_log_mean, mut_log_std, wt_vs_mut_rho_df[&quot;expression_reps_MUT&quot;], equal_var=False)[1]
wt_vs_mut_rho_df[&quot;wt_vs_mut_qvalue&quot;] = modeling.fdr(wt_vs_mut_rho_df[&quot;wt_vs_mut_pvalue&quot;])

# Pull out WT polylinker measurements
print(&quot;Joining Rho and Polylinker data together.&quot;)
poly_wt_df = poly_df.copy()
poly_wt_df = poly_wt_df[poly_wt_df.index.str.contains(&quot;WT&quot;)]

# Drop the variant ID
poly_wt_df = poly_wt_df.rename(index=lambda x: x[:-3], columns={&quot;expression&quot;: &quot;expression_POLY&quot;, &quot;expression_SEM&quot;: &quot;expression_SEM_POLY&quot;, &quot;expression_log2&quot;: &quot;expression_log2_POLY&quot;})

# Join with Rho
activity_df = wt_vs_mut_rho_df.join(poly_wt_df)

print(&quot;Annotating sequences for binding patterns.&quot;)
# Get info on CRX binding from the seq ID strings
activity_df[&quot;crx_bound&quot;] = activity_df.index.str.contains(&quot;_C...$&quot;)

# Read in BED files
library_bed = BedTool(os.path.join(data_dir, &quot;library1And2.bed&quot;))
nrl_chip_bed = BedTool(os.path.join(&quot;Data&quot;, &quot;Downloaded&quot;, &quot;ChIP&quot;, &quot;nrlPeaksMm10.bed&quot;))
mef2d_chip_bed = BedTool(os.path.join(&quot;Data&quot;, &quot;Downloaded&quot;, &quot;ChIP&quot;, &quot;mef2dPeaksMm10.bed&quot;))

# Get binding patterns for NRL and MEF2D
library_nrl_bound_df = library_bed.intersect(nrl_chip_bed, wa=True).to_dataframe()
activity_df[&quot;nrl_bound&quot;] = activity_df.index.isin(library_nrl_bound_df[&quot;name&quot;])

library_mef2d_bound_df = library_bed.intersect(mef2d_chip_bed, wa=True).to_dataframe()
activity_df[&quot;mef2d_bound&quot;] = activity_df.index.isin(library_mef2d_bound_df[&quot;name&quot;])

# Helper function to &quot;reverse one hot encode&quot; binding patterns
def annotate_binding(row):
    crx, nrl, mef2d = row[[&quot;crx_bound&quot;, &quot;nrl_bound&quot;, &quot;mef2d_bound&quot;]]
    if crx:
        if nrl:
            if mef2d:
                return &quot;All three&quot;
            else:
                return &quot;CRX+NRL&quot;
        elif mef2d:
            return &quot;CRX+MEF2D&quot;
        else:
            return &quot;CRX only&quot;
    elif nrl:
        if mef2d:
            return &quot;NRL+MEF2D&quot;
        else:
            return &quot;NRL only&quot;
    elif mef2d:
        return &quot;MEF2D only&quot;
    else:
        return &quot;No binding&quot;

activity_df[&quot;binding_group&quot;] = activity_df.apply(annotate_binding, axis=1)
print(&quot;Done processing and annotating data. This table corresponds to Supplementary file 3.&quot;)
display(activity_df.head())</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Joining together data from the two libraries with the Rho promoter.
Annotating sequences as strong enhancer, weak enhancer, inactive, silencer, or ambiguous.
Cutoff to call something a strong enhancer: activity is above 2.84
Joining together data from the two libraries with the Polylinker promoter and annotate for autonomous activity.
Computing the effect size upon mutating CRX motifs in the presence of the Rho promoter.
This is for Figure 5, but it is easier to do it here.
Joining Rho and Polylinker data together.
Annotating sequences for binding patterns.
</code></pre>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in greater
  return (self.a &lt; x) &amp; (x &lt; self.b)
/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in less
  return (self.a &lt; x) &amp; (x &lt; self.b)
/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1821: RuntimeWarning: invalid value encountered in less_equal
  cond2 = cond0 &amp; (x &lt;= self.a)
</code></pre>
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Done processing and annotating data. This table corresponds to Supplementary file 3.
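</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_std_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_reps_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_pvalue_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_qvalue_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">library_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_log2_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">plot_color_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">variant_WT</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">...</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">wt_vs_mut_pvalue</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">wt_vs_mut_qvalue</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_POLY</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_SEM_POLY</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">expression_log2_POLY</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">autonomous_activity</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">crx_bound</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">nrl_bound</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">mef2d_bound</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">binding_group</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1-104768570-104768734_UPCQ</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.606621</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.297412</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.0</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.001206</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.003548</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1.851048</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">#a6cee3</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">...</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.092328</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.147455</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-6.643856</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">No binding</td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1-106008207-106008371_CPPE</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">2.068611</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.944664</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.0</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.080583</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.103242</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1.049360</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">NaN</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">grey</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">...</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.145377</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.212937</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-6.643856</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">True</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX only</td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1-106696554-106696718_CPPE</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">8.261201</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1.317719</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.0</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000008</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000217</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.046526</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">#1f78b4</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">...</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.003104</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.013211</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.795621</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.058574</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-0.311827</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">True</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX only</td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1-118321635-118321799_CPPP</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1.368148</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.397835</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.0</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.166861</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.196017</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.453279</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">#33a02c</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">...</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.080966</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.132766</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.000000</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-6.643856</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">True</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX only</td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1-118589610-118589774_UPCE</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.184993</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.077742</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">3.0</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.019478</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.031968</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">1</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-2.426678</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">#e31a1c</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">...</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.005790</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.019789</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.308888</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">0.138871</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">-1.648877</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">False</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">No binding</td>
                </tr>
              </tbody>
            </table>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">5 rows × 31 columns</p>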
          </figure>
        </stencila-code-chunk>
        <h3 itemscope="" itemtype="http://schema.stenci.la/Heading"
          id="strong-enhancers-and-silencers-have-high-crx-motif-content">Strong enhancers and
          silencers have high CRX motif content</h3>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory activities of
          CRX-targeted sequences vary widely (<a href="#fig1" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 1a</a>). We defined enhancers and
          silencers as those sequences that have statistically significant activity that is at least
          twofold above or below the activity of the basal <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter (Welch’s t-test,
          Benjamini-Hochberg false discovery rate (FDR) q &lt; 0.05, <a href="#supp3" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 3</a>). We defined inactive
          sequences as those whose activity is both within a twofold change of basal activity and
          not significantly different from the basal <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter. We further stratified
          enhancers into strong and weak enhancers based on whether or not they fell above the 95th
          percentile of scrambled sequences. Using these criteria, 22% of CRX-targeted sequences are
          strong enhancers, 28% are weak enhancers, 19% are inactive, and 17% are silencers; the
          remaining 13% were considered ambiguous and removed from further analysis. To test whether
          these sequences function as CRX-dependent enhancers and silencers in the genome, we
          examined genes differentially expressed in <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Crx<sup itemscope=""
              itemtype="http://schema.stenci.la/Superscript">-/-</sup></em> retina <cite
            itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib71"><span>71</span><span>Roger et al.</span><span>2014</span></a></cite>.
          Genes that are de-repressed are more likely to be near silencers (Fisher’s exact test p =
          0.001, odds ratio = 2.1, n = 206) and genes that are down-regulated are more likely to be
          near enhancers (Fisher’s exact test p = 0.02, odds ratio = 1.5, n = 344, Materials and
          methods), suggesting that our reporter assay identified sequences that act as enhancers
          and silencers in the genome. We next sought to identify features that accurately
          distinguish these classes of sequences.</p>
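        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The binning rules described above can be sketched as a small function. This is an illustration only, not the analysis code used for the article: the name <code>classify_activity</code>, its arguments, and the example cutoff of 2.0 are hypothetical. Activities are assumed to be log2 ratios relative to the basal Rho promoter, with FDR-corrected q-values already computed.</p>

```python
import math


def classify_activity(log2_activity, qvalue, strong_cutoff_log2,
                      fold_cutoff_log2=1.0, fdr=0.05):
    """Assign one reporter sequence to an activity class.

    log2_activity: log2(activity / basal promoter activity).
    qvalue: FDR-corrected q-value for the difference from basal.
    strong_cutoff_log2: hypothetical cutoff, assumed to stand in for the
        95th percentile of scrambled-sequence activity (log2 scale).
    Returns a class name, or None for unmeasured or ambiguous sequences.
    """
    if math.isnan(log2_activity) or math.isnan(qvalue):
        return None  # not measured
    significant = qvalue < fdr
    if significant and log2_activity >= fold_cutoff_log2:
        # Enhancers are split by the scrambled-sequence cutoff
        if log2_activity > strong_cutoff_log2:
            return "Strong enhancer"
        return "Weak enhancer"
    if significant and log2_activity <= -fold_cutoff_log2:
        return "Silencer"
    if not significant and abs(log2_activity) < fold_cutoff_log2:
        return "Inactive"
    # e.g. more than twofold from basal but not significant
    return None


# Illustrative inputs (not measured data)
examples = [(3.0, 0.001), (1.5, 0.01), (0.4, 0.2), (-2.4, 0.03), (1.05, 0.10)]
for log2_act, q in examples:
    print(log2_act, q, classify_activity(log2_act, q, strong_cutoff_log2=2.0))
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Sequences that satisfy neither the active nor the inactive criteria fall through to <code>None</code>, which corresponds to the ambiguous group that is excluded from further analysis.</p>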
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="8" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Calculate predicted occupancy of all TFs
print(&quot;Computing predicted occupancy of 8 TFs on every WT and mutant sequence. This might take 2-3 minutes.&quot;)

# Load in PWMs
pwms = predicted_occupancy.read_pwm_files(os.path.join(&quot;Data&quot;, &quot;Downloaded&quot;, &quot;Pwm&quot;, &quot;photoreceptorAndEnrichedMotifs.meme&quot;))
pwms = pwms.rename(lambda x: x.split(&quot;_&quot;)[0])
# Reverse complement RAX for display purposes
rax = pwms[&quot;RAX&quot;].copy()
rax = rax[::-1].reset_index(drop=True)
rax_rc = rax.copy()
rax_rc[&quot;A&quot;] = rax[&quot;T&quot;]
rax_rc[&quot;C&quot;] = rax[&quot;G&quot;]
rax_rc[&quot;G&quot;] = rax[&quot;C&quot;]
rax_rc[&quot;T&quot;] = rax[&quot;A&quot;]
pwms[&quot;RAX&quot;] = rax_rc
motif_len = pwms.apply(len)
ewms = pwms.apply(predicted_occupancy.ewm_from_letter_prob).apply(predicted_occupancy.ewm_to_dict)
mu = 9

# Do predicted occupancy scans
occupancy_df = predicted_occupancy.all_seq_total_occupancy(all_seqs, ewms, mu, convert_ewm=False)
print(&quot;Done computing predicted occupancies. This corresponds to Supplementary table 4.&quot;)
display(occupancy_df.head())

# Separate out the WT sequences
wt_occupancy_df = occupancy_df[occupancy_df.index.str.contains(&quot;WT$&quot;)].copy()
wt_occupancy_df = sequence_annotation_processing.remove_mutations_from_seq_id(wt_occupancy_df)
wt_occupancy_df = wt_occupancy_df.loc[activity_df.index]
n_tfs = len(wt_occupancy_df.columns)</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Computing predicted occupancy of 8 TFs on every WT and mutant sequence. This might take 2-3 minutes.
Done computing predicted occupancies. This corresponds to Supplementary table 4.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">GFI1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">MAZ</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">MEF2D</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">NDF1</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">NRL</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RORB</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RAX</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-4357766-4357930_CPPP_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.297972</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.187172</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.204502e-8</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000001421229</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.064604e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.001505</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.02370847</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.005755</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-4357766-4357930_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.239708</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.783122e-11</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.204502e-8</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000001421229</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.064606e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.411916</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.02340304</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.004416</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-73826292-73826456_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.290427</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.00639738</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.005577725</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.815852e-9</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.713635e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.993418</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.0002922269</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000004</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-73826292-73826456_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.29341</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.20373e-8</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.005577725</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.339047e-11</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.713632e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.993414</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.23963e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000002</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr11-87108697-87108861_CPPP_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.71847</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.6025624</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.74423e-12</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.000002986062</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">6.477337e-7</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.040965</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.00004672926</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.190641</span></td>
                </tr>
              </tbody>
            </table>
          </figure>
        </stencila-code-chunk>
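        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The RAX reverse-complement step in the chunk above reverses the motif positions and swaps A with T and C with G. A minimal standalone sketch of the same operation, using plain Python lists instead of the pandas objects in the notebook, might look like this. The function name <code>reverse_complement_pwm</code> and the two-position toy matrix are hypothetical.</p>

```python
def reverse_complement_pwm(pwm):
    """Reverse complement a position weight matrix.

    pwm: list of dicts, one per motif position, keyed by "A", "C", "G", "T".
    Returns a new list; the input is left unchanged.
    """
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    # Reverse the position order, then give each base its complement's weight
    return [
        {base: row[complement[base]] for base in "ACGT"}
        for row in reversed(pwm)
    ]


# Toy motif preferring "AC"; its reverse complement should prefer "GT"
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
toy_rc = reverse_complement_pwm(toy_pwm)
print([max(row, key=row.get) for row in toy_rc])
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Applying the function twice returns the original matrix, which is a quick sanity check for any reverse-complement routine.</p>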
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="9" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code>print(&quot;Computing information content of sequences.&quot;)
entropy_df = occupancy_df.apply(predicted_occupancy.boltzmann_entropy, axis=1)
print(&quot;Done computing information content and related metrics. This corresponds to Supplementary table 5.&quot;)
display(entropy_df.head())

wt_entropy_df = entropy_df[entropy_df.index.str.contains(&quot;WT$&quot;)].copy()
wt_entropy_df = sequence_annotation_processing.remove_mutations_from_seq_id(wt_entropy_df)
wt_entropy_df = wt_entropy_df.loc[activity_df.index]

mut_entropy_df = entropy_df[entropy_df.index.str.contains(&quot;MUT&quot;)].copy()
mut_entropy_df = sequence_annotation_processing.remove_mutations_from_seq_id(mut_entropy_df)
mut_entropy_df = mut_entropy_df.loc[activity_df.index]</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Computing information content of sequences.
Done computing information content and related metrics. This corresponds to Supplementary table 5.
</code></pre>
            <table itemscope="" itemtype="http://schema.org/Table">
              <thead>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">total_occupancy</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">diversity</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">entropy</th>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                </tr>
              </thead>
              <tbody>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-4357766-4357930_CPPP_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.516114</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2.291861</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-4357766-4357930_CPPP_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.679445</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.440493</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-73826292-73826456_CPPE_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.296117</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.74337</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr1-73826292-73826456_CPPE_MUT-allCrxSites</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.292404</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">0.378922</span></td>
                </tr>
                <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                    chr11-87108697-87108861_CPPP_WT</td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">3.552689</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">2</span></td>
                  <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                      data-itemtype="http://schema.org/Number">1.867968</span></td>
                </tr>
              </tbody>
            </table>
          </figure>
        </stencila-code-chunk>
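        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The implementation of <code>predicted_occupancy.boltzmann_entropy</code> is not shown here. The sketch below is one reading that appears consistent with the rows displayed above, under three assumptions: total occupancy is the sum of the per-TF predicted occupancies, diversity counts TFs whose occupancy exceeds 0.5, and entropy is log2 of a multinomial-style count evaluated with gamma functions. The function name <code>information_content</code> and the 0.5 threshold are assumptions, not taken from the source.</p>

```python
import math


def information_content(occupancies, threshold=0.5):
    """Summarize per-TF predicted occupancies for one sequence.

    Returns (total_occupancy, diversity, entropy_in_bits).
    Assumed definition: entropy = log2( Gamma(N + 1) / prod_i Gamma(N_i + 1) ),
    where N_i is the occupancy of TF i and N is their sum.
    """
    total = sum(occupancies)
    # Assumption: a TF counts towards diversity above this occupancy threshold
    diversity = sum(1 for occ in occupancies if occ > threshold)
    # lgamma keeps the calculation in log space for non-integer occupancies
    log_w = math.lgamma(total + 1) - sum(math.lgamma(occ + 1) for occ in occupancies)
    entropy = log_w / math.log(2)
    return total, diversity, entropy


# Occupancies copied from the first displayed row (chr1-4357766-4357930_CPPP_WT),
# in the order CRX, GFI1, MAZ, MEF2D, NDF1, NRL, RORB, RAX
row = [2.297972, 0.187172, 2.204502e-8, 1.421229e-6,
       3.064604e-7, 1.001505, 0.02370847, 0.005755]
total, diversity, entropy = information_content(row)
print(f"total_occupancy={total:.6f} diversity={diversity} entropy={entropy:.6f}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Under this definition a sequence dominated by a single TF scores near zero, while occupancy spread across several TFs raises the score, which is why mutating all CRX sites lowers both diversity and entropy in the rows above.</p>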
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig1" title="Figure 1.">
          <label data-itemprop="label">Figure 1.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="10" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># Mapping activity class to a color
color_mapping = {
    &quot;Silencer&quot;: &quot;#e31a1c&quot;,
    &quot;Inactive&quot;: &quot;#33a02c&quot;,
    &quot;Weak enhancer&quot;: &quot;#a6cee3&quot;,
    &quot;Strong enhancer&quot;: &quot;#1f78b4&quot;,
    np.nan: &quot;grey&quot;
}
color_mapping = pd.Series(color_mapping)

# Sort order for the four activity bins
class_sort_order = [&quot;Silencer&quot;, &quot;Inactive&quot;, &quot;Weak enhancer&quot;, &quot;Strong enhancer&quot;]
activity_df[&quot;group_name_WT&quot;] = sequence_annotation_processing.to_categorical(activity_df[&quot;group_name_WT&quot;])
activity_df[&quot;group_name_MUT&quot;] = sequence_annotation_processing.to_categorical(activity_df[&quot;group_name_MUT&quot;])
rho_ticks = np.arange(-10, 7, 2)

# We can only plot points that were detected in DNA
activity_measured_wt_df = activity_df[activity_df[&quot;expression_log2_WT&quot;].notna()]
print(&quot;Frequency of each activity bin in WT sequences:&quot;)
display(activity_measured_wt_df[&quot;group_name_WT&quot;].value_counts(normalize=True, dropna=False, sort=False))

# Count frequency of activity bins for CRX bound/unbound
crx_bound_grouper = activity_df.groupby(&quot;crx_bound&quot;)
chip_activity_bin_freqs = crx_bound_grouper[&quot;group_name_WT&quot;].value_counts().unstack()
chip_activity_bin_freqs = chip_activity_bin_freqs[class_sort_order].rename(index=lambda x: &quot;ChIP-seq&quot; if x else &quot;ATAC-seq&quot;)

# Different ways to format group names
chip_group_names_with_n = [f&quot;{i}\nn={j.sum()}&quot; for i, j in chip_activity_bin_freqs.iterrows()]
chip_group_names_with_n_oneline = [&quot; &quot;.join(i.split()) for i in chip_group_names_with_n]
chip_group_names = chip_activity_bin_freqs.index.values
chip_group_count = [j.sum() for i, j in chip_activity_bin_freqs.iterrows()]

# Display the data behind Fig 1b
print(&quot;Frequency of activity bins vs. CRX binding status:&quot;)
display(chip_activity_bin_freqs)

# Test if CRX binding and inactive status is independent
chip_group_inactive_counts = crx_bound_grouper[&quot;group_name_WT&quot;].apply(lambda x: (x == &quot;Inactive&quot;).value_counts()).unstack()
oddsratio, pval = stats.fisher_exact(chip_group_inactive_counts)
# Take inverse of odds ratio to match language of manuscript and be more intuitive to the reader
print(f&quot;ChIP-seq status is independent of if a sequence is inactive, Fisher&#39;s exact test p={pval:.0e}, odds ratio={1/oddsratio:.2f}&quot;)

# Same for strong enhancer
chip_group_strong_counts = crx_bound_grouper[&quot;group_name_WT&quot;].apply(lambda x: (x == &quot;Strong enhancer&quot;).value_counts()).unstack()
oddsratio, pval = stats.fisher_exact(chip_group_strong_counts)
# Here the odds ratio is reported directly, without taking the inverse
print(f&quot;ChIP-seq status is independent of if a sequence is a strong enhancer, Fisher&#39;s exact test p={pval:.0e}, odds ratio={oddsratio:.2f}&quot;)

# Row-normalize the counts
chip_activity_bin_freqs = chip_activity_bin_freqs.div(chip_activity_bin_freqs.sum(axis=1), axis=0)
display(chip_activity_bin_freqs)

# Set up groupings, labels, and counts used by the panels below
wt_activity_grouper = activity_df.groupby(&quot;group_name_WT&quot;)
wt_activity_names_oneline = [&quot;Silencer&quot;, &quot;Inactive&quot;, &quot;Weak enh.&quot;, &quot;Strong enh.&quot;]
wt_activity_count = [len(j) for i, j in wt_activity_grouper]

# Predicted CRX occupancy vs. WT group
wt_occupancy_grouper = wt_occupancy_df.groupby(activity_df[&quot;group_name_WT&quot;])
wt_occupancy_grouper_crx = wt_occupancy_grouper[&quot;CRX&quot;]
print(&quot;Predicted CRX occupancies:&quot;)
display(wt_occupancy_grouper_crx.describe())

# Statistics for differences in CRX occupancy between groups
ustat, pval = stats.mannwhitneyu(wt_occupancy_grouper_crx.get_group(&quot;Strong enhancer&quot;), wt_occupancy_grouper_crx.get_group(&quot;Inactive&quot;), alternative=&quot;two-sided&quot;)
print(f&quot;Strong enhancers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = {pval:.0e}, U = {ustat:.2f}&quot;)
ustat, pval = stats.mannwhitneyu(wt_occupancy_grouper_crx.get_group(&quot;Silencer&quot;), wt_occupancy_grouper_crx.get_group(&quot;Inactive&quot;), alternative=&quot;two-sided&quot;)
print(f&quot;Silencers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = {pval:.0e}, U = {ustat:.2f}&quot;)

# Generate the figure
gs_kw = dict(width_ratios=[1, 3])
fig, ax_list = plt.subplots(nrows=2, ncols=2, figsize=(6, 8), gridspec_kw=gs_kw)
gs = ax_list[0, 0].get_gridspec()
for ax in ax_list[0, :]:
    ax.remove()
    
axbig = fig.add_subplot(gs[0, :])
ax = axbig

# 1a: Volcano plot
fig = plot_utils.volcano_plot(activity_measured_wt_df, &quot;expression_log2_WT&quot;, &quot;expression_qvalue_WT&quot;,
                             activity_measured_wt_df[&quot;plot_color_WT&quot;], xaxis_label=&quot;log2 Enhancer Activity/Rho&quot;,
                             yaxis_label=&quot;-log10 FDR&quot;, xline=-np.log10(0.05), yline=[-1, 1],
                             xticks=rho_ticks[1:], figax=(fig, ax))
ax.set_yticks(np.arange(5))
plot_utils.add_letter(ax, -0.125, 1, &quot;a&quot;)

# 1b: CRX binding status vs. activity classes
ax = ax_list[1, 0]
fig = plot_utils.stacked_bar_plots(chip_activity_bin_freqs, &quot;Fraction of group&quot;, chip_group_names, color_mapping, figax=(fig, ax), vert=True)
ax.set_yticks(np.linspace(0, 1, 6))
plot_utils.rotate_ticks(ax.get_xticklabels())    

# Add ticks above to show the n
ax_twin = ax.twiny()
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(chip_group_count, fontsize=10, rotation=45)
plot_utils.add_letter(ax, -0.7, 1.03, &quot;b&quot;)

# 1c: Predicted CRX occupancy of different groups
ax = ax_list[1, 1]
fig = plot_utils.violin_plot_groupby(wt_occupancy_grouper_crx, &quot;Predicted CRX occupancy&quot;, class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))
ax.set_yticks(np.linspace(0, 8, 5))
plot_utils.rotate_ticks(ax.get_xticklabels())

# Add ticks above to show the n
ax_twin = ax.twiny()
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)
plot_utils.add_letter(ax, -0.2, 1.03, &quot;c&quot;)
fig.tight_layout()
display(fig)
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Frequency of each activity bin in WT sequences:
</code></pre>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Silencer           0.173615
Inactive           0.192491
Weak enhancer      0.282099
Strong enhancer    0.218005
NaN                0.133790
Name: group_name_WT, dtype: float64</code></pre>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Frequency of activity bins vs. CRX binding status:
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">crx_bound</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">ATAC-seq</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">281</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">363</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">430</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">211</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">ChIP-seq</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">556</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">565</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">930</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">840</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>ChIP-seq status is independent of if a sequence is inactive, Fisher&#39;s exact test p=2e-07, odds ratio=1.49
ChIP-seq status is independent of if a sequence is a strong enhancer, Fisher&#39;s exact test p=1e-21, odds ratio=2.16
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">crx_bound</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">ATAC-seq</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.218677</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.28249</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.33463</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.164202</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">ChIP-seq</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.192321</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.195434</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.321688</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.290557</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Predicted CRX occupancies:
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">count</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">mean</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">std</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">min</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">25%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">50%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">75%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">max</th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">837</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.822068</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.474613</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.013521</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.59851</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.724195</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.916786</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">8.028408</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">928</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.232489</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.342345</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.001052</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.173444</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.048457</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.136282</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">6.759976</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1360</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.216861</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.220496</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000385</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.235126</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.11381</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.988673</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">7.801177</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1051</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.53401</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.16946</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.003694</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.616414</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.490314</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.285321</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">7.3685</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Strong enhancers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = 6e-10, U = 566045.00
Silencers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = 6e-17, U = 477843.00
</code></pre><img src="index.html.media/10" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="activity-of-putative-cis-regulatory-sequences-with-cone-rod-homeobox-crx-motifs">
              Activity of putative <em itemscope=""
                itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequences with
              cone-rod homeobox (CRX) motifs.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">(<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">a</strong>) Volcano plot of activity
              scores relative to the <em itemscope=""
                itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter alone. Sequences are
              grouped as strong enhancers (dark blue), weak enhancers (light blue), inactive
              (green), silencers (red), or ambiguous (gray). Horizontal line, false discovery rate
              (FDR) q = 0.05. Vertical lines, twofold above and below <em itemscope=""
                itemtype="http://schema.stenci.la/Emphasis">Rho</em>. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>) Fraction of ChIP-seq and
              ATAC-seq peaks that belong to each activity group. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">c</strong>) Predicted CRX occupancy of
              each activity group. Horizontal lines, medians; enh., enhancer. Numbers at top of
              (<strong itemscope="" itemtype="http://schema.stenci.la/Strong">b and c</strong>)
              indicate n for groups.</p>
          </figcaption>
        </figure>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Neither CRX ChIP-seq-binding
          status nor DNA accessibility as measured by ATAC-seq strongly differentiates between these
          four classes (<a href="#fig1" itemscope="" itemtype="http://schema.stenci.la/Link">Figure
            1b</a>). Compared to CRX ChIP-seq peaks, ATAC-seq peaks that lack CRX binding in the
          adult retina are slightly enriched for inactive sequences (Fisher’s exact test p = 2 ×
          10<sup itemscope="" itemtype="http://schema.stenci.la/Superscript">–7</sup>, odds ratio =
          1.5) and slightly depleted for strong enhancers (Fisher’s exact test p = 1 × 10<sup
            itemscope="" itemtype="http://schema.stenci.la/Superscript">–21</sup>, odds ratio =
          2.2). However, sequences with ChIP-seq or ATAC-seq peaks span all four activity
          categories, consistent with prior reports that DNA accessibility and TF binding data are
          not sufficient to identify functional enhancers and silencers <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib11"><span>11</span><span>Doni
                  Jayavelu et al.</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib29"><span>29</span><span>Huang et
                  al.</span><span>2019</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib30"><span>30</span><span>Huang et
                  al.</span><span>2021</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib62"><span>62</span><span>Pang and
                  Snyder</span><span>2020</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib85"><span>85</span><span>White et
                  al.</span><span>2013</span></a></cite></span>.</p>
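        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">As a minimal sketch of the
          enrichment comparison above, the counts in the activity-bin table can be collapsed into
          2 × 2 contingency tables and passed to <code>scipy.stats.fisher_exact</code>. The helper
          <code>two_by_two</code> is a hypothetical name introduced here for illustration. The sketch
          uses only the four classified bins, so its odds ratios and p-values need not match the
          reported values if the reported tests were computed over a different set of sequences
          (for example, one that also includes ambiguous sequences).</p>

```python
from scipy.stats import fisher_exact

# Counts copied from the activity-bin table above (classified sequences only).
counts = {
    "ATAC-seq": {"Silencer": 281, "Inactive": 363, "Weak enhancer": 430, "Strong enhancer": 211},
    "ChIP-seq": {"Silencer": 556, "Inactive": 565, "Weak enhancer": 930, "Strong enhancer": 840},
}


def two_by_two(counts, group):
    """Hypothetical helper: rows are ATAC-seq and ChIP-seq,
    columns are (in `group`, not in `group`)."""
    table = []
    for assay in ("ATAC-seq", "ChIP-seq"):
        in_group = counts[assay][group]
        total = sum(counts[assay].values())
        table.append([in_group, total - in_group])
    return table


# Odds of being inactive in ATAC-seq-only peaks relative to ChIP-seq peaks
or_inactive, p_inactive = fisher_exact(two_by_two(counts, "Inactive"), alternative="two-sided")

# Odds of being a strong enhancer in ATAC-seq-only peaks relative to ChIP-seq peaks
or_strong, p_strong = fisher_exact(two_by_two(counts, "Strong enhancer"), alternative="two-sided")

print(f"Inactive: odds ratio = {or_inactive:.2f}, p = {p_inactive:.0e}")
# The inverse is printed so that depletion reads as a ratio above 1
print(f"Strong enhancer: 1 / odds ratio = {1 / or_strong:.2f}, p = {p_strong:.0e}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">An odds ratio above 1 in the
          first test indicates enrichment of inactive sequences among ATAC-seq-only peaks; an odds
          ratio below 1 in the second indicates depletion of strong enhancers.</p>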
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We examined whether the number
          and affinity of CRX motifs differentiate enhancers, silencers, and inactive sequences by
          computing the predicted CRX occupancy (i.e. expected number of bound molecules) for each
          sequence <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib85"><span>85</span><span>White et al.</span><span>2013</span></a></cite>.
          Consistent with our previous work <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib86"><span>86</span><span>White et
                al.</span><span>2016</span></a></cite>, both strong enhancers and silencers have
          higher predicted CRX occupancy than inactive sequences (Mann-Whitney U test, p = 6 ×
          10<sup itemscope="" itemtype="http://schema.stenci.la/Superscript">–10</sup> and 6 ×
          10<sup itemscope="" itemtype="http://schema.stenci.la/Superscript">–17</sup>,
          respectively, <a href="#fig1" itemscope="" itemtype="http://schema.stenci.la/Link">Figure
            1c</a>), suggesting that total CRX motif content helps distinguish silencers and strong
          enhancers from inactive sequences. However, predicted CRX occupancy does not distinguish
          strong enhancers from silencers: a logistic regression classifier trained with fivefold
          cross-validation achieves an area under the receiver operating characteristic curve (AUROC)
          of only 0.548 ± 0.023 and an area under the precision-recall curve (AUPR) of only 0.571 ±
          0.020 (<a href="#fig2" itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2a</a>
          and <a href="#fig2ab" itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2—figure
            supplement 1</a>). We thus sought to identify sequence features that distinguish strong
          enhancers from silencers.</p>
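        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The Mann-Whitney U statistics
          reported above can be turned into an effect size: U divided by the product of the two
          group sizes is the probability that a randomly chosen sequence from one group has a higher
          value than a randomly chosen sequence from the other (ties counted as one half), which is
          also the AUROC of that single feature. The sketch below is illustrative only. The function
          names are hypothetical, and it assumes that each reported U counts pairs in which the
          first-named group has the higher occupancy and that the group sizes equal the counts in
          the occupancy table.</p>

```python
def mann_whitney_u(x, y):
    """U statistic for x: number of (x_i, y_j) pairs with x_i greater than y_j,
    counting ties as one half. Quadratic time, intended for small examples."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def common_language_effect(u, n_x, n_y):
    """Probability that a random member of x exceeds a random member of y."""
    return u / (n_x * n_y)


# Toy check on made-up numbers (not data from the paper)
toy_x = [3.1, 2.4, 4.0]
toy_y = [2.0, 2.4, 1.5]
toy_u = mann_whitney_u(toy_x, toy_y)
print(f"Toy example: U = {toy_u}, effect size = {common_language_effect(toy_u, 3, 3):.3f}")

# Reported U statistics and group sizes: (U, n of first group, n of inactive group)
reported = {
    "Strong enhancer vs. inactive": (566045.0, 1051, 928),
    "Silencer vs. inactive": (477843.0, 837, 928),
}
effects = {
    name: common_language_effect(u, n_first, n_inactive)
    for name, (u, n_first, n_inactive) in reported.items()
}
for name, effect in effects.items():
    print(f"{name}: U / (n1 * n2) = {effect:.3f}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Under these assumptions, both
          effect sizes lie only modestly above 0.5, the value expected when the two occupancy
          distributions are the same.</p>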
        <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
          data-execution_count="11" data-programminglanguage="python">
          <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
            slot="text"><code># Prepare data for fitting models
# Mask to pull out the silencers and strong enhancers
silencer_modeling_mask = activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Strong|Silencer&quot;)
silencer_modeling_mask = silencer_modeling_mask &amp; silencer_modeling_mask.notna()
# Mask to pull out the inactive seqs and the strong enhancers
inactive_modeling_mask = activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Strong|Inactive&quot;)
inactive_modeling_mask = inactive_modeling_mask &amp; inactive_modeling_mask.notna()

# Within the data to model, mask indicating which sequences are strong enhancers
labels_with_silencer = activity_df.loc[silencer_modeling_mask, &quot;group_name_WT&quot;].str.contains(&quot;Strong&quot;)
labels_with_inactive = activity_df.loc[inactive_modeling_mask, &quot;group_name_WT&quot;].str.contains(&quot;Strong&quot;)

# Write strong enhancers and silencers to file for the SVM
seq_bins_dir = os.path.join(data_dir, &quot;ActivityBins&quot;)
positives_fasta = os.path.join(seq_bins_dir, &quot;strongEnhancer.fasta&quot;)
negatives_fasta = os.path.join(seq_bins_dir, &quot;silencer.fasta&quot;)
all_strong_mask = activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Strong&quot;)
all_strong_mask = all_strong_mask &amp; all_strong_mask.notna()
strong_ids = activity_df.loc[all_strong_mask, &quot;variant_WT&quot;]
fasta_seq_parse_manip.write_fasta(all_seqs[strong_ids.index + &quot;_&quot; + strong_ids], positives_fasta)
all_silencer_mask = activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Silencer&quot;)
all_silencer_mask = all_silencer_mask &amp; all_silencer_mask.notna()
silencer_ids = activity_df.loc[all_silencer_mask, &quot;variant_WT&quot;]
fasta_seq_parse_manip.write_fasta(all_seqs[silencer_ids.index + &quot;_&quot; + silencer_ids], negatives_fasta)

# Fit k-mer SVM
print(&quot;Fitting k-mer Support Vector Machine. This will take a few minutes.&quot;)
# Hyperparameter setup
seed = 1210
word_len = 6
max_mis = 1
nfolds = 5

models_dir = &quot;Models&quot;
svm_dir = os.path.join(models_dir, &quot;StrongEnhancerVsSilencer&quot;)
os.makedirs(svm_dir, exist_ok=True)

# Fit the SVM
svm_prefix = os.path.join(svm_dir, f&quot;gkmsvm_{word_len}_{word_len}_{max_mis}&quot;)
fig_list, xaxis, svm_tpr, svm_prec, svm_f1, svm_scores = gkmsvm.train_with_cv(positives_fasta, negatives_fasta, svm_prefix, num_folds=nfolds, word_len=word_len, info_pos=word_len, max_mis=max_mis, seed=seed)
plt.close()

# Fit logistic regression models
print(&quot;Fitting strong enhancer vs. silencer logistic regression model for CRX occupancy.&quot;)
cv = StratifiedKFold(n_splits=nfolds, shuffle=True, random_state=seed)
crx_clf = LogisticRegression()
crx_clf, crx_tpr_list, crx_prec_list, crx_f1_list = modeling.train_estimate_variance(crx_clf, cv, wt_occupancy_df.loc[silencer_modeling_mask, &quot;CRX&quot;], labels_with_silencer, xaxis, positive_cutoff=0)

print(&quot;Fitting strong enhancer vs. silencer logistic regression model for 8 TFs.&quot;)
occ_clf = LogisticRegression()
param_grid = {&quot;C&quot;: np.logspace(-4, 4, 9)}
np.random.seed(seed)
occ_clf, occ_tpr_list, occ_prec_list = modeling.grid_search_hyperparams(occ_clf, nfolds, param_grid, &quot;f1&quot;, wt_occupancy_df[silencer_modeling_mask], labels_with_silencer, xaxis, positive_cutoff=0)
c_opt = occ_clf.get_params()[&quot;C&quot;]
print(f&quot;Optimal regularization strength (C): {c_opt:1.1e}&quot;)</code></pre>
          <figure slot="outputs">
            <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Fitting k-mer Support Vector Machine. This will take a few minutes.
Fitting strong enhancer vs. silencer logistic regression model for CRX occupancy.
Fitting strong enhancer vs. silencer logistic regression model for 8 TFs.
Optimal regularization strength (C): 1.0e-02
</code></pre><img src="index.html.media/11" alt="" itemscope=""
              itemtype="http://schema.org/ImageObject">
          </figure>
        </stencila-code-chunk>
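        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The internals of the
          <code>modeling</code> helpers called above are not shown here. As a rough stand-in, the
          following sketch evaluates a single-feature logistic regression with stratified fivefold
          cross-validation using scikit-learn directly. The data are synthetic (drawn from two
          overlapping normal distributions, not taken from the paper), the variable names are
          illustrative, and average precision is used as an approximation of the area under the
          precision-recall curve.</p>

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

seed = 1210
nfolds = 5
rng = np.random.default_rng(seed)

# Synthetic stand-in for one predicted-occupancy feature (NOT the paper's data):
# two classes whose distributions overlap heavily.
n_per_class = 400
negatives = rng.normal(loc=2.5, scale=1.0, size=n_per_class)
positives = rng.normal(loc=3.0, scale=1.0, size=n_per_class)
X = np.concatenate([negatives, positives]).reshape(-1, 1)
y = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

# Stratified folds keep the class balance the same in every train/test split
cv = StratifiedKFold(n_splits=nfolds, shuffle=True, random_state=seed)

# One score per held-out fold; the spread across folds estimates the variance
aurocs = cross_val_score(LogisticRegression(), X, y, cv=cv, scoring="roc_auc")
auprs = cross_val_score(LogisticRegression(), X, y, cv=cv, scoring="average_precision")

print(f"AUROC = {aurocs.mean():.3f} +/- {aurocs.std():.3f}")
print(f"AUPR  = {auprs.mean():.3f} +/- {auprs.std():.3f}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Reporting the mean and standard
          deviation over held-out folds mirrors the "mean ± standard deviation" format used for
          the model metrics in this article.</p>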
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2" title="Figure 2.">
          <label data-itemprop="label">Figure 2.</label>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="strong-enhancers-contain-a-diverse-array-of-motifs">Strong enhancers contain a
              diverse array of motifs.</h4>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2ab"
          title="Figure 2a and b, and Figure 2—figure supplement 1"><label
            data-itemprop="label">Figure 2a and b, and Figure 2—figure supplement 1</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="12" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># Generate the figure -- this has to be done in a few pieces
modeling_xaxis = np.linspace(0, 1, 100)
fig, ax_list = plot_utils.setup_multiplot(2, sharex=False, sharey=False)
# Separate figure handle for the PR curves
fig_pr, ax_pr = plt.subplots()

# 2a and Figure 2--figure supplement 1: ROC and PR curves with SVM, TF occupancies, CRX occupancy
model_data = [ # (TPR, precision, name, color)
    (svm_tpr, svm_prec, &quot;SVM&quot;, &quot;black&quot;),
    (occ_tpr_list, occ_prec_list, f&quot;{n_tfs} TFs&quot;, &quot;#E69B04&quot;),
    (crx_tpr_list, crx_prec_list, &quot;CRX&quot;, &quot;#009980&quot;)
]

model_tprs, model_precs, model_names, model_colors = zip(*model_data)
prc_chance = activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Strong&quot;).sum() / activity_df[&quot;group_name_WT&quot;].str.contains(&quot;Strong|Silencer&quot;).sum()

# Generate figures
_, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std = plot_utils.roc_pr_curves(
    modeling_xaxis, model_tprs, model_precs, model_names, model_colors=model_colors,
    prc_chance=prc_chance, figax=([fig, fig_pr], [ax_list[0], ax_pr])
)
ax_list[0].set_xticks(np.linspace(0, 1, 6))
plot_utils.add_letter(ax_list[0], -0.25, 1.03, &quot;a&quot;)

# Display model metrics
print(&quot;Model metrics:&quot;)
for name, auroc, auroc_std, aupr, aupr_std in zip(model_names, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std):
    print(f&quot;{name}\tAUROC={auroc:.3f}+/-{auroc_std:.3f}\tAUPR={aupr:.3f}+/-{aupr_std:.3f}&quot;)

# Calculate total predicted occupancy of each class
wt_entropy_grouper = wt_entropy_df.groupby(activity_df[&quot;group_name_WT&quot;])
print(&quot;Total predicted occupancy of all TFs in each group:&quot;)
display(wt_entropy_grouper[&quot;total_occupancy&quot;].describe())

# 2b: Total predicted occupancy of each class
ax = ax_list[1]
fig = plot_utils.violin_plot_groupby(wt_entropy_grouper[&quot;total_occupancy&quot;], &quot;Total predicted TF occupancy&quot;, class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))
plot_utils.rotate_ticks(ax.get_xticklabels())
plot_utils.add_letter(ax, -0.25, 1.03, &quot;b&quot;)

# Add ticks above to show the n
ax_twin = ax.twiny()
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)

print(&quot;Figure 2, panels A and B:&quot;)
fig.tight_layout()
display(fig)
print(&quot;Figure 2--figure supplement 1:&quot;)
display(fig_pr)
plt.close()
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Model metrics:
SVM	AUROC=0.781+/-0.013	AUPR=0.812+/-0.020
8 TFs	AUROC=0.698+/-0.036	AUPR=0.745+/-0.032
CRX	AUROC=0.548+/-0.023	AUPR=0.571+/-0.020
Total predicted occupancy of all TFs in each group:
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">count</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">mean</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">std</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">min</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">25%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">50%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">75%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">max</th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">837</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.588419</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.848387</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.067069</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.167386</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.408131</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.845272</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">11.848887</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">928</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.005903</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.690368</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.03447</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.777625</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.810142</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.968906</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">12.011682</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1360</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.068334</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.582532</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.010029</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.935493</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.921969</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.031018</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">12.521734</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1051</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.782727</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.622289</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.02116</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.577761</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.664645</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.762179</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">10.185356</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Figure 2, panels A and B:
</code></pre><img src="index.html.media/12" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Figure 2--figure supplement 1:
</code></pre><img src="index.html.media/13" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading" id="figure-2">Figure 2</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">(<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">a</strong>) Receiver operating
              characteristic for classifying strong enhancers from silencers. Solid black, 6-mer
              support vector machine (SVM); orange, eight transcription factors (TFs) predicted
              occupancy logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy
              logistic regression; dashed black, chance; shaded area, 1 standard deviation based on
              fivefold cross-validation. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>) Total predicted TF occupancy
              in each activity class.</p>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="figure-2-figure-supplement-1-precision-recall-curve-for-strong-enhancer-vs-silencer-classifiers">
              Figure 2-figure supplement 1. Precision-recall curve for strong enhancer vs. silencer
              classifiers.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Solid black, 6-mer support
              vector machine (SVM); orange, eight transcription factors (TFs) predicted occupancy
              logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy logistic
              regression; dashed black, chance; shaded area, 1 standard deviation based on fivefold
              cross-validation.</p>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2c" title="Figure 2c">
          <label data-itemprop="label">Figure 2c</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="13" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># Calculate motif frequency in each class
occupied_cutoff = 0.5
motif_freq_df = wt_occupancy_grouper.apply(lambda x: (x &gt; occupied_cutoff).sum() / len(x))
# Sort by the feature importance in the logistic model
feature_importance = occ_clf.coef_[0]
feature_order = feature_importance.argsort()
motif_freq_df = motif_freq_df.iloc[:, feature_order]

# Make the fig
fig, ax_list = plt.subplots(nrows=8, ncols=2, figsize=(6, 4), gridspec_kw=dict(width_ratios=[1, 2]))
gs = ax_list[0, 0].get_gridspec()
for ax in ax_list[:, 1]:
    ax.remove()
    
axbig = fig.add_subplot(gs[:, 1])

ax = axbig
vmax = 0.25
thresh = vmax / 2
motif_freq_no_crx_df = motif_freq_df.drop(columns=&quot;CRX&quot;)
heatmap = ax.imshow(motif_freq_no_crx_df.T, aspect=&quot;auto&quot;, vmin=0, vmax=vmax, cmap=&quot;Reds&quot;)
ax.set_xticks(np.arange(len(wt_activity_names_oneline)))
ax.set_xticklabels(wt_activity_names_oneline, rotation=90)
ax.set_yticks(np.arange(len(motif_freq_no_crx_df.columns)))
ax.set_yticklabels(motif_freq_no_crx_df.columns)
plot_utils.annotate_heatmap(ax, motif_freq_no_crx_df, thresh)

# Add the logos
for cax, tf in zip(ax_list[1:, 0], motif_freq_no_crx_df.columns):
    pwm = logomaker.transform_matrix(pwms[tf], from_type=&quot;probability&quot;, to_type=&quot;information&quot;)
    logomaker.Logo(pwm, ax=cax, color_scheme=&quot;colorblind_safe&quot;, show_spines=False)
    # Right-align the logos
    cax.set_xlim(left=motif_len[tf] - motif_len.max() - 0.5)
    cax.set_ylim(top=2)
    cax.set_xticks([])
    cax.set_yticks([])

# Add a colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes(&quot;right&quot;, size=&quot;5%&quot;, pad=&quot;2%&quot;)
colorbar = fig.colorbar(heatmap, cax=cax, label=&quot;Frequency of motif&quot;)
ticks = cax.get_yticks()
ticks = [f&quot;{i:.2f}&quot; for i in ticks]
ticks[-1] = r&quot;$\geq$&quot; + ticks[-1]
cax.set_yticklabels(ticks)

# Add CRX
cax = divider.append_axes(&quot;top&quot;, size=&quot;14%&quot;, pad=&quot;2%&quot;)
heatmap = cax.imshow(motif_freq_df[&quot;CRX&quot;].to_frame().T, aspect=&quot;auto&quot;, vmin=0, vmax=vmax, cmap=&quot;Reds&quot;)
cax.xaxis.tick_top()
cax.set_xticks(ax.get_xticks())
cax.set_xlim(ax.get_xlim())
cax.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)
cax.set_yticks([0])
cax.set_yticklabels([&quot;CRX&quot;])
plot_utils.annotate_heatmap(cax, motif_freq_df[&quot;CRX&quot;].to_frame(), thresh)

# Add CRX logo
cax = ax_list[0, 0]
pwm = logomaker.transform_matrix(pwms[&quot;CRX&quot;], from_type=&quot;probability&quot;, to_type=&quot;information&quot;)
logomaker.Logo(pwm, ax=cax, color_scheme=&quot;colorblind_safe&quot;, show_spines=False)
# Right-align the logos
cax.set_xlim(left=motif_len[&quot;CRX&quot;] - motif_len.max() - 0.5)
cax.set_ylim(top=2)
cax.set_xticks([])
cax.set_yticks([])

plot_utils.add_letter(cax, 0, 1.03, &quot;c&quot;)
print(&quot;Figure 2c&quot;)
fig.tight_layout(pad=0)
display(fig)
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>in validate_matrix(): Row sums in df are not close to 1. Reormalizing rows...
Figure 2c
</code></pre><img src="index.html.media/14" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">(<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">c</strong>) Frequency of TF motifs in each
              activity class.</p>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2def"
          title="Figure 2d, e, and f"><label data-itemprop="label">Figure 2d, e, and f</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="14" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># Setup figure
fig, ax_list = plt.subplots(nrows=2, ncols=2, figsize=(8, 4), gridspec_kw=dict(height_ratios=[3, 2]))
ax2d = ax_list[0, 0]
ax2f = ax_list[1, 0]
for ax in ax_list[:, 1]:
    ax.remove()

ax2e = fig.add_subplot(ax2d.get_gridspec()[:, 1])

# Calculate co-occurrence of motifs in strong enhancers
strong_enh_coocc_df = wt_occupancy_grouper.get_group(&quot;Strong enhancer&quot;)[[&quot;RAX&quot;, &quot;NRL&quot;, &quot;MAZ&quot;, &quot;NDF1&quot;, &quot;RORB&quot;]]
strong_enh_coocc_df = (strong_enh_coocc_df &gt; occupied_cutoff).astype(int)
strong_enh_coocc_df = strong_enh_coocc_df.T.dot(strong_enh_coocc_df) / len(strong_enh_coocc_df)
# Fill in lower triangle with the expected values
for row in range(len(strong_enh_coocc_df)):
    for col in range(row + 1, len(strong_enh_coocc_df)):
        strong_enh_coocc_df.iloc[row, col] = strong_enh_coocc_df.iloc[row, row] * strong_enh_coocc_df.iloc[col, col]
        
# 2d: Make the heatmap
ax = ax2d
vmax = 0.25
thresh = vmax / 2
heatmap = ax.imshow(strong_enh_coocc_df, aspect=&quot;auto&quot;, cmap=&quot;Reds&quot;, vmax=vmax, vmin=0)
ax.set_title(&quot;Strong enhancers&quot;)
ax.set_xticks(np.arange(len(strong_enh_coocc_df.columns)))
ax.set_xticklabels(strong_enh_coocc_df.columns)
ax.set_yticks(np.arange(len(strong_enh_coocc_df.columns)))
ax.set_yticklabels(strong_enh_coocc_df.columns)
plot_utils.annotate_heatmap(ax, strong_enh_coocc_df, thresh, adjust_lower_triangle=True)

# Add colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes(&quot;right&quot;, size=&quot;5%&quot;, pad=&quot;2%&quot;)
colorbar = fig.colorbar(heatmap, cax=cax, label=&quot;Freq. motifs\nco-occur&quot;, ticks=[0, round(thresh, 2), vmax])
plot_utils.add_letter(ax, -0.25, 1.03, &quot;d&quot;)

# Calculate activity classes for different binding combos
binding_combos_activity_freq = activity_measured_wt_df.groupby(&quot;binding_group&quot;)[&quot;group_name_WT&quot;].value_counts().unstack()
binding_combos_activity_freq = binding_combos_activity_freq[class_sort_order]
# Ignore cases where there is NRL or MEF2D but not CRX
binding_combos_activity_freq = binding_combos_activity_freq.loc[[&quot;No binding&quot;, &quot;CRX only&quot;, &quot;CRX+NRL&quot;, &quot;CRX+MEF2D&quot;, &quot;All three&quot;]]
binding_combos_activity_freq = binding_combos_activity_freq.astype(int)

# Generate names then normalize data
binding_combos_names = binding_combos_activity_freq.index.values
binding_combos_count = [j.sum() for i, j in binding_combos_activity_freq.iterrows()]
binding_combos_activity_freq = binding_combos_activity_freq.div(binding_combos_activity_freq.sum(axis=1), axis=0)
display(binding_combos_activity_freq)

# 2e: make plot
ax = ax2e
fig = plot_utils.stacked_bar_plots(binding_combos_activity_freq, &quot;Fraction of group&quot;, binding_combos_names, color_mapping, figax=(fig, ax), vert=True)
ax.set_yticks(np.linspace(0, 1, 6))
plot_utils.rotate_ticks(ax.get_xticklabels())

# Add the n
ax_twin = ax.twiny()
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(binding_combos_count, fontsize=10, rotation=45)
plot_utils.add_letter(ax, -0.25, 1.03, &quot;e&quot;)

# Frequency each class is bound by each TF
group_bound_freqs = activity_measured_wt_df.groupby(&quot;group_name_WT&quot;)[[&quot;crx_bound&quot;, &quot;nrl_bound&quot;, &quot;mef2d_bound&quot;]].apply(lambda x: x.sum() / len(x))
group_bound_freqs.columns = group_bound_freqs.columns.str.split(&quot;_&quot;).str[0].str.upper()

# 2f: Make heatmap
vmax = 1
thresh = vmax / 2
ax = ax2f
heatmap = ax.imshow(group_bound_freqs.T, aspect=&quot;auto&quot;, cmap=&quot;Reds&quot;, vmax=vmax, vmin=0)
ax.set_xticks(np.arange(len(wt_activity_names_oneline)))
ax.set_xticklabels(wt_activity_names_oneline, rotation=90)
ax.set_yticks(np.arange(len(group_bound_freqs.columns)))
ax.set_yticklabels(group_bound_freqs.columns)
plot_utils.annotate_heatmap(ax, group_bound_freqs, thresh)

# Add colorbar
divider = make_axes_locatable(ax)
cax = divider.append_axes(&quot;right&quot;, size=&quot;5%&quot;, pad=&quot;2%&quot;)
colorbar = fig.colorbar(heatmap, cax=cax, label=&quot;Fraction\nbound&quot;)
plot_utils.add_letter(ax, -0.25, 1.03, &quot;f&quot;)

# Add ticks above to show the n
ax_twin = ax.twiny()
ax_twin.set_axes_locator(ax.get_axes_locator())
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)

print(&quot;Figure 2, panels D-F&quot;)
fig.tight_layout(pad=0)
display(fig)
plt.close()</code></pre>
            <figure slot="outputs">
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">binding_group</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">No binding</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.221493</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.2863</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.331419</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.160788</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX only</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.203553</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.222276</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.346615</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.227556</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX+NRL</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.19256</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.115974</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.238512</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.452954</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX+MEF2D</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.145</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.165</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.28</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.41</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">All three</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.099338</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.10596</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.284768</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.509934</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Figure 2, panels D-F
</code></pre><img src="index.html.media/15" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">(<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">d</strong>) Frequency of co-occurring TF
              motifs in strong enhancers. Lower triangle is expected co-occurrence if motifs are
              independent. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">e</strong>) Frequency of activity classes,
              colored as in (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>), for sequences in CRX, NRL,
              and/or MEF2D ChIP-seq peaks. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">f</strong>) Frequency of TF ChIP-seq peaks
              in activity classes. TFs in (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">c</strong>) are sorted by feature
              importance of the logistic regression model in (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">a</strong>).</p>
          </figcaption>
        </figure>
        <h3 itemscope="" itemtype="http://schema.stenci.la/Heading"
          id="lineage-defining-tf-motifs-differentiate-strong-enhancers-from-silencers">
          Lineage-defining TF motifs differentiate strong enhancers from silencers</h3>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We performed a de novo motif
          enrichment analysis to identify motifs that distinguish strong enhancers from silencers
          and found several differentially enriched motifs matching known TFs. For motifs that
          matched multiple TFs, we selected one representative TF for downstream analysis, since
          the PWMs of TFs from the same family are too similar to distinguish meaningfully between
          their motifs (<a href="#fig2s2" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2—figure supplement 2</a>, Materials and
          methods). Strong enhancers are enriched for several motif families that include TFs that
          interact with CRX or are important for photoreceptor development: NeuroD1/NDF1
          (E-box-binding bHLH) <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib59"><span>59</span><span>Morrow et al.</span><span>1999</span></a></cite>,
          RORB (nuclear receptor) <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib36"><span>36</span><span>Jia et
                  al.</span><span>2009</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib79"><span>79</span><span>Srinivas
                  et al.</span><span>2006</span></a></cite></span>, MAZ or Sp4 (C2H2 zinc finger)
          <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib51"><span>51</span><span>Lerner et al.</span><span>2005</span></a></cite>,
          and NRL (bZIP) <span itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite
              itemscope="" itemtype="http://schema.stenci.la/Cite"><a
                href="#bib55"><span>55</span><span>Mears et
                  al.</span><span>2001</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib56"><span>56</span><span>Mitton
                  et al.</span><span>2000</span></a></cite></span>. Sp4 physically interacts with
          CRX in the retina <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib51"><span>51</span><span>Lerner et al.</span><span>2005</span></a></cite>,
          but we chose to represent the zinc finger motif with MAZ because it has a higher quality
          score in the HOCOMOCO database <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a
              href="#bib46"><span>46</span><span>Kulakovskiy et
                al.</span><span>2018</span></a></cite>. Silencers were enriched for a motif that
          resembles a partial K50 homeodomain motif but instead matches the zinc finger TF GFI1, a
          member of the Snail repressor family <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib8"><span>8</span><span>Chiang and
                Ayyanathan</span><span>2013</span></a></cite> expressed in developing retinal
          ganglion cells <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib88"><span>88</span><span>Yang et al.</span><span>2003</span></a></cite>.
          Therefore, while strong enhancers and silencers are not distinguished by their CRX motif
          content, strong enhancers are uniquely enriched for motifs of several lineage-defining
          TFs.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">To quantify how well these TF
          motifs differentiate strong enhancers from silencers, we trained two different
          classification models with fivefold cross-validation. First, we trained a 6-mer support
          vector machine (SVM) <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib19"><span>19</span><span>Ghandi et al.</span><span>2014</span></a></cite>
          and achieved an AUROC of 0.781 ± 0.013 and AUPR of 0.812 ± 0.020 (<a href="#fig2"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2a</a> and <a href="#fig2ab"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2—figure supplement 1</a>).
          The SVM considers all 2080 non-redundant 6-mers and provides an upper bound to the
          predictive power of models that do not consider the exact arrangement or spacing of
          sequence features. We next trained a logistic regression model on the predicted occupancy
          for eight lineage-defining TFs (<a href="#supp4" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 4</a>) and compared it to the
          upper bound established by the SVM. In this model, we considered CRX, the five TFs
          identified in our motif enrichment analysis, and two additional TFs enriched in
          photoreceptor ATAC-seq peaks <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib31"><span>31</span><span>Hughes et al.</span><span>2017</span></a></cite>:
          RAX, a Q50 homeodomain TF that contrasts with CRX, a K50 homeodomain TF <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib34"><span>34</span><span>Irie et
                al.</span><span>2015</span></a></cite>, and MEF2D, a MADS box TF which co-binds with
          CRX <cite itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib2"><span>2</span><span>Andzelm et al.</span><span>2015</span></a></cite>.
          The logistic regression model performs nearly as well as the SVM (AUROC 0.698 ± 0.036,
          AUPR 0.745 ± 0.032, <a href="#fig2" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2a</a> and <a href="#fig2ab" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2—figure supplement 1</a>) despite a
          260-fold reduction from 2080 to 8 features. To determine whether the logistic regression
          model depends specifically on the eight lineage-defining TFs, we established a null
          distribution by fitting 100 logistic regression models with randomly selected TFs
          (Materials and methods). Our logistic regression model outperforms the null distribution
          (one-tailed Z-test for AUROC and AUPR, p &lt; 0.0008, <a href="#fig2s3" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2—figure supplement 3</a>), indicating
          that the performance of the model specifically requires the eight lineage-defining TFs. To
          determine whether the SVM identified any additional motifs that could be added to the
          logistic regression model, we generated de novo motifs using the SVM <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">k</em>-mer scores and found no additional
          motifs predictive of strong enhancers. Finally, we found that our two models perform
          similarly on an independent test set of CRX-targeted sequences (<cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib85"><span>85</span><span>White et
                al.</span><span>2013</span></a></cite>; <a href="#fig2s3" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2—figure supplement 3</a>). Since the
          logistic regression model performs near the upper bound established by the SVM and depends
          specifically on the eight selected motifs, we conclude that these motifs comprise nearly
          all of the sequence features captured by the SVM that distinguish strong enhancers from
          silencers among CRX-targeted sequences.</p>
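        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The count of 2080 6-mers
          can be checked with a short sketch. It assumes, as the count implies but the text does
          not spell out, that "non-redundant" means a 6-mer and its reverse complement are counted
          once. The function names are illustrative and are not taken from the analysis code.</p>

```python
from itertools import product

# Translation table mapping each base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_nonredundant_kmers(k):
    """Count k-mers after merging each k-mer with its reverse complement.

    Assumption: "non-redundant" means strand-collapsed k-mers.
    """
    canonical = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        # Represent each strand pair by its lexicographically smaller member
        canonical.add(min(kmer, reverse_complement(kmer)))
    return len(canonical)


n_kmers = count_nonredundant_kmers(6)
n_tfs = 8  # number of TF occupancy features in the logistic regression model
print(n_kmers, n_kmers // n_tfs)  # prints: 2080 260
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Of the 4096 possible 6-mers,
          64 are their own reverse complement, which gives (4096 + 64) / 2 = 2080 features and the
          260-fold reduction to eight TF occupancy features.</p>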
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2s2"
          title="Figure 2—figure supplement 2."><label data-itemprop="label">Figure 2—figure
            supplement 2.</label><img src="index.html.media/fig2-figsupp2.jpg" alt="" itemscope=""
            itemtype="http://schema.org/ImageObject">
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="results-from-de-novo-motif-analysis">Results from de novo motif analysis.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Motifs enriched in strong
              enhancers (<strong itemscope="" itemtype="http://schema.stenci.la/Strong">a</strong>)
              and silencers (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>). Bottom, de novo motif
              identified with DREME; top, matched known motif identified with TOMTOM.</p>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2s3"
          title="Figure 2—figure supplement 3."><label data-itemprop="label">Figure 2—figure
            supplement 3.</label>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="additional-validation-of-the-eight-transcription-factors-tfs-predicted-occupancy-logistic-regression-model">
              Additional validation of the predicted occupancy logistic regression model based on
              eight transcription factors (TFs).</h4>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2s3ab"
          title="Figure 2—figure supplement 3 a and b."><label data-itemprop="label">Figure 2—figure
            supplement 3 a and b.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="15" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code>print(&quot;Only panels A and B are shown here. Generating the data for panels C and D will take approximately 50 minutes. If you are interested in generating these panels, the code is in the next cell, but commented out.&quot;)
white_data_dir = os.path.join(&quot;Data&quot;, &quot;Downloaded&quot;, &quot;CrxMpraLibraries&quot;)
white_seqs = pd.read_csv(os.path.join(white_data_dir, &quot;white2013Sequences.txt&quot;), sep=&quot;\t&quot;, header=None, usecols=[0, 8], index_col=0, names=[&quot;label&quot;, &quot;sequence&quot;]).squeeze(&quot;columns&quot;)
# Only keep barcode1 sequences since barcode info isn&#39;t needed
bc_tag = &quot;_barcode1&quot;
white_seqs = white_seqs[white_seqs.index.str.contains(bc_tag)]
# Trim off the barcode ID
white_seqs = white_seqs.rename(lambda x: x[1:-len(bc_tag)])
# Only keep the 84 bp of the sequence that corresponds to the library
seq_len = 84
seq_start = len(&quot;TAGCGTCTGTCCGTGAATTC&quot;) + 1
white_seqs = white_seqs.str[seq_start:seq_start+seq_len]
# Function to correct off by one error in labeling
def correct_label(name):
    chrom, pos, group = name.split(&quot;_&quot;)
    pos = int(pos) + 1
    return &quot;_&quot;.join([chrom, str(pos), group])

white_activity_df = pd.read_csv(os.path.join(white_data_dir, &quot;white2013Activity.txt&quot;), sep=&quot;\t&quot;, index_col=0, usecols=[0, 1, 2, 3], names=[&quot;label&quot;, &quot;class&quot;, &quot;expression&quot;, &quot;expression_SEM&quot;], header=0)
# Correct the off by one error of the labels
white_activity_df = white_activity_df.rename(correct_label)
white_activity_df[&quot;expression_log2&quot;] = np.log2(white_activity_df[&quot;expression&quot;])

white_measured_seqs = white_seqs[white_activity_df.index]

print(&quot;Computing predicted occupancy of all TFs on the test set.&quot;)
white_occupancy_df = predicted_occupancy.all_seq_total_occupancy(white_measured_seqs, ewms, mu, convert_ewm=False)
print(&quot;Done computing predicted occupancy.&quot;)
display(white_occupancy_df.head())

# Define cutoffs
scrambled_mask = white_activity_df[&quot;class&quot;].str.contains(&quot;SCR&quot;)
strong_cutoff = white_activity_df.loc[scrambled_mask, &quot;expression_log2&quot;].quantile(0.95)
white_scrambled_mean = white_activity_df.loc[scrambled_mask, &quot;expression_log2&quot;].mean()

# Pull out bound sequences
bound_mask = white_activity_df[&quot;class&quot;].str.match(&quot;CBR(M|NO)$&quot;)
bound_activity_df = white_activity_df[bound_mask].copy()
bound_occupancy_df = white_occupancy_df[bound_mask]

# Pull out relevant sequences
white_strong_mask = bound_activity_df[&quot;expression_log2&quot;] &gt; strong_cutoff
white_silencer_mask = bound_activity_df[&quot;expression_log2&quot;] &lt; (white_scrambled_mean - 1)
white_modeling_mask = white_strong_mask | white_silencer_mask
white_labels = white_strong_mask[white_modeling_mask]

# Make predictions
print(&quot;Making predictions on the test set with the SVM and 8 TF logistic regression model.&quot;)
# Write sequences to file for the SVM
white_modeling_seqs = white_seqs[bound_activity_df.index][white_modeling_mask]
white_modeling_fasta = os.path.join(svm_dir, &quot;white2013TestSet.fasta&quot;)
fasta_seq_parse_manip.write_fasta(white_modeling_seqs, white_modeling_fasta)

# SVM
svm_white_tpr, svm_white_prec, svm_white_scores, svm_white_f1 = gkmsvm.predict_and_eval(white_modeling_fasta, white_labels, svm_prefix, word_len, word_len, max_mis, xaxis)

# Logistic model
occupancy_probs = occ_clf.predict_proba(bound_occupancy_df[white_modeling_mask])
occupancy_white_tpr, occupancy_white_prec, occupancy_white_f1 = modeling.calc_tpr_precision_fbeta(white_labels, occupancy_probs[:, 1], xaxis, positive_cutoff=0.5)

# Setup figure
fig, ax_list = plot_utils.setup_multiplot(2, n_cols=2, sharex=False, sharey=False)

# Plot White 2013 test set
_, white_aurocs, _, white_auprs, _ = plot_utils.roc_pr_curves(
    modeling_xaxis, [svm_white_tpr, occupancy_white_tpr], [svm_white_prec, occupancy_white_prec],
    model_names[:2], model_colors=model_colors[:2], prc_chance=svm_white_prec[-1],
    figax=([fig, fig], ax_list)
)

plot_utils.add_letter(ax_list[0], -0.15, 1.03, &quot;a&quot;)
plot_utils.add_letter(ax_list[1], -0.15, 1.03, &quot;b&quot;)

# Display model performance
print(&quot;Model performance on White 2013 test set:&quot;)
print(f&quot;{model_names[0]}\tAUROC = {white_aurocs[0]:.3f}\tAUPR = {white_auprs[0]:.3f}&quot;)
print(f&quot;{model_names[1]}\tAUROC = {white_aurocs[1]:.3f}\tAUPR = {white_auprs[1]:.3f}&quot;)
fig.tight_layout()
display(fig)
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Only panels A and B are shown here. Generating the data for panels C and D will take approximately 50 minutes. If you are interested in generating these panels, the code is in the next cell, but commented out.
Computing predicted occupancy of all TFs on the test set.
Done computing predicted occupancy.
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">CRX</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">GFI1</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">MAZ</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">MEF2D</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">NDF1</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">NRL</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RORB</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">RAX</th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">label</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                      chr1_100559800_SCRUBR</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.274096</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.545296e-13</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.630613e-11</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.707551e-14</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.017487e-7</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000854</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.00004694361</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.008889</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1_100559800_UBR
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.178397</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">5.862032e-11</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000001102815</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.221394e-10</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.001066875</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000541</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">8.777171e-7</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.001608</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1_100750470_UBR
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.430898</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">8.232504e-7</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">5.564299e-11</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.960941e-10</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.01272582</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.969272</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000001295348</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.001267</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">chr1_108920170_UBR
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.072197</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.00732386</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">6.147587e-16</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.758899e-9</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.658399e-10</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.808744</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.005559077</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.003341</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">
                      chr1_11177090_SCRUBR</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.214338</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.0004034044</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">4.444271e-14</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.389581e-7</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.62783e-10</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000005</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.001550753</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.118118</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Making predictions on the test set with the SVM and 8 TF logistic regression model.
Model performance on White 2013 test set:
SVM	AUROC = 0.800	AUPR = 0.821
8 TFs	AUROC = 0.662	AUPR = 0.714
</code></pre><img src="index.html.media/16" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Predictions of the 6-mer
              support vector machine (SVM) (black) and the eight-TF predicted occupancy logistic
              regression model (orange) on an independent test set. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">a</strong>) Receiver operating
              characteristic, (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>) precision-recall curve. Dashed
              black line represents chance in both panels.</p>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2s3cd_static"
          title="Figure 2—figure supplement 3c and d, static."><label data-itemprop="label">Figure
            2—figure supplement 3c and d, static.</label><img
            src="index.html.media/fig2-figsupp3.jpg" alt="" itemscope=""
            itemtype="http://schema.org/ImageObject">
          <figcaption>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Static version of the
              figure to display panels (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">c</strong>) and (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">d</strong>). Null distribution of 100
              logistic regression models trained using randomly selected motifs (gray) compared to
              the true features (orange). Shaded area, 1 standard deviation based on fivefold
              cross-validation. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">c</strong>) Receiver operating
              characteristic, (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">d</strong>) precision-recall curve. Dashed
              black line represents chance in both panels.</p>
          </figcaption>
        </figure>
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig2s3cd_int"
          title="Figure 2—figure supplement 3c and d, interactive."><label
            data-itemprop="label">Figure 2—figure supplement 3c and d, interactive.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="16" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># # Read in HOCOMOCO database
# hocomoco = predicted_occupancy.read_pwm_files(os.path.join(&quot;Data&quot;, &quot;Downloaded&quot;, &quot;Pwm&quot;, &quot;photoreceptorMotifsAndHOCOMOCOv11_full_MOUSE.meme&quot;))
# hocomoco = hocomoco.apply(predicted_occupancy.ewm_from_letter_prob).apply(predicted_occupancy.ewm_to_dict)

# wt_seqs = all_seqs[all_seqs.index.str.contains(&quot;WT&quot;)].copy()
# wt_seqs = sequence_annotation_processing.remove_mutations_from_seq_id(wt_seqs)
# wt_seqs = wt_seqs[activity_df.index]
# modeling_seqs = wt_seqs[silencer_modeling_mask]

# niter = 100
# nfeatures = len(ewms)
# # Track the cross-validated mean TPR and precision for each feature set
# random_tprs = []
# random_precs = []
# # Keep track of the features selected for each round
# random_ewms = []

# np.random.seed(seed)
# for i in range(niter):
#     if i % 10 == 9:
#         print(f&quot;Iteration {i+1}&quot;)
        
#     # Randomly sample PWMs
#     sample = hocomoco.sample(nfeatures)
#     random_ewms.append(sample.index.str.split(&quot;_&quot;).str[0].values)
#     # Do predicted occupancy scan
#     features = predicted_occupancy.all_seq_total_occupancy(modeling_seqs, sample, mu, convert_ewm=False)
#     # Fit the model
#     clf = LogisticRegression(C=c_opt)
#     clf, tpr, prec, f1 = modeling.train_estimate_variance(clf, cv, features, labels_with_silencer, xaxis, positive_cutoff=0)
    
#     # Store the result
#     random_tprs.append(np.mean(tpr, axis=0))
#     random_precs.append(np.mean(prec, axis=0))
    
# fig, ax_list = plot_utils.setup_multiplot(2, n_cols=2, sharex=False, sharey=False)
# niter_rand = len(random_occ_tprs)
# rand_tpr_plotting = [[j] for i, j in random_occ_tprs.iterrows()] + [occ_tpr_cv]
# rand_prec_plotting = [[j] for i, j in random_occ_precs.iterrows()] + [occ_prec_cv]
# rand_names = [&quot;&quot;]  * niter_rand + [&quot;True features&quot;]
# rand_colors = [&quot;#8080801A&quot;] * niter_rand + [&quot;#E69B04&quot;]

# _, background_aurocs, _, background_auprs, _ = plot_utils.roc_pr_curves(
#     modeling_xaxis, rand_tpr_plotting, rand_prec_plotting, rand_names, model_colors=rand_colors,
#     prc_chance=prc_chance, figax=([fig, fig], ax_list)
# )

# plot_utils.add_letter(ax_list[0], -0.15, 1.03, &quot;c&quot;)
# plot_utils.add_letter(ax_list[1], -0.15, 1.03, &quot;d&quot;)

# # KS test, null hypothesis: random AUROCs and AUPRs are normally distributed
# # One-tailed Z-test that the real data is drawn from this distribution
# for data, name in zip([background_aurocs, background_auprs], [&quot;AUROC&quot;, &quot;AUPR&quot;]):
#     real, rand = data[niter_rand], data[:niter_rand]
#     dstat, pval = stats.kstest(stats.zscore(rand), &quot;norm&quot;)
#     print(f&quot;{name}s of random features are normally distributed, KS test p = {pval:.2f}, D = {dstat:.2f}&quot;)
#     zscore = (real - np.mean(rand)) / np.std(rand)
#     pval = stats.norm.cdf(-np.abs(zscore))
#     print(f&quot;Probability that the {name} of the real features is drawn from the background distribution, one-tailed Z-test p = {pval:2f}&quot;)

# display(fig)
# plt.close()</code></pre>
          </stencila-code-chunk>
          <figcaption>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Interactive version of
              panels (<strong itemscope="" itemtype="http://schema.stenci.la/Strong">c</strong>) and
              (<strong itemscope="" itemtype="http://schema.stenci.la/Strong">d</strong>). Note that
              this takes close to an hour to run.</p>
          </figcaption>
        </figure>
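        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The comparison between the true
          features and the null distribution of randomly selected motifs can be summarized with a
          one-tailed Z-test, as in the commented code above. The following is a minimal sketch using
          only the Python standard library; the function name one_tailed_z_test and the toy AUROC
          values are hypothetical and are not results from this study.</p>

```python
from statistics import NormalDist, mean, pstdev


def one_tailed_z_test(real_value, null_values):
    """Z-score of real_value against null_values and its one-tailed p-value.

    Uses the population standard deviation, mirroring numpy's default np.std.
    """
    z = (real_value - mean(null_values)) / pstdev(null_values)
    p = NormalDist().cdf(-abs(z))
    return z, p


# Toy AUROCs, invented for illustration only (not results from the study).
null_aurocs = [0.50, 0.52, 0.48, 0.51, 0.49]
z, p = one_tailed_z_test(0.60, null_aurocs)
print(f"z = {z:.2f}, one-tailed p = {p:.2e}")
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The Z-test is only meaningful
          if the null values are approximately normally distributed, which is why the commented code
          first applies a KS test to the random AUROCs and AUPRs.</p>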
        <h3 itemscope="" itemtype="http://schema.stenci.la/Heading"
          id="strong-enhancers-are-characterized-by-diverse-total-motif-content">Strong enhancers
          are characterized by diverse total motif content</h3>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">To understand how these eight
          TF motifs differentiate strong enhancers from silencers, we first calculated the total
          predicted occupancy of each sequence by all eight lineage-defining TFs and compared the
          different activity classes. Strong enhancers and silencers both have higher total
          predicted occupancies than inactive sequences, but total predicted occupancies do not
          distinguish strong enhancers and silencers from each other (<a href="#fig2" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2b</a>, <a href="#supp5" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 5</a>). Since strong
          enhancers are enriched for several motifs relative to silencers, this suggests that strong
          enhancers are distinguished from silencers by the diversity of their motifs, rather than
          the total number.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We considered two hypotheses
          for how the more diverse collection of motifs functions in strong enhancers: either strong
          enhancers depend on specific combinations of TF motifs (‘TF identity hypothesis’) or they
          instead must be co-occupied by multiple lineage-defining TFs, regardless of TF identity
          (‘TF diversity hypothesis’). To distinguish between these hypotheses, we examined which
          specific motifs contribute to the total motif content of strong enhancers and silencers.
          We considered motifs for a TF present in a sequence if the TF predicted occupancy was
          above 0.5 molecules (<a href="#supp4" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 4</a>), which generally
          corresponds to at least one motif with a relative <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">K</em><sub itemscope=""
            itemtype="http://schema.stenci.la/Subscript">D</sub> above 3%. This threshold captures
          the effect of low affinity motifs that are often biologically relevant <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib10"><span>10</span><span>Crocker
                  et al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib15"><span>15</span><span>Farley
                  et al.</span><span>2015</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib16"><span>16</span><span>Farley
                  et al.</span><span>2016</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib63"><span>63</span><span>Parker
                  et al.</span><span>2011</span></a></cite></span>. As expected, 97% of strong
          enhancers and silencers contain CRX motifs since the sequences were selected based on CRX
          binding or significant matches to the CRX PWM within open chromatin (<a href="#fig2"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2c</a>). Compared to
          silencers, strong enhancers contain a broader diversity of motifs for the eight
          lineage-defining TFs (<a href="#fig2" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2c</a>). However, while strong enhancers
          contain a broader range of motifs, no single motif occurs in a majority of strong
          enhancers: NRL motifs are present in 23% of strong enhancers, NeuroD1 and RORB in 18%
          each, and MAZ in 16%. Additionally, none of the motifs tend to co-occur as pairs in strong
          enhancers: no specific pair occurred in more than 5% of sequences (<a href="#fig2"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2d</a>). We also did not
          observe a bias in the linear arrangement of motifs in strong enhancers (Materials and
          methods). Similarly, no single motif occurs in more than 15% of silencers (<a href="#fig2"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2c</a>). These results
          suggest that strong enhancers are defined by the diversity of their motifs, and not by
          specific motif combinations or their linear arrangement.</p>
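        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The motif presence and
          co-occurrence tallies described above can be sketched as follows. This is a minimal
          illustration, assuming each sequence is represented as a mapping from TF name to predicted
          occupancy; the function names and the toy occupancies are hypothetical, and only the 0.5
          threshold comes from the text.</p>

```python
from itertools import combinations

# Occupancy above this value counts as "motif present" (threshold from the text).
PRESENCE_THRESHOLD = 0.5


def motifs_present(occupancy):
    """Return the set of TFs whose predicted occupancy exceeds the threshold."""
    return {tf for tf, occ in occupancy.items() if occ > PRESENCE_THRESHOLD}


def motif_frequencies(sequences):
    """Fraction of sequences containing each motif and each unordered motif pair."""
    n = len(sequences)
    single, pair = {}, {}
    for occupancy in sequences.values():
        present = sorted(motifs_present(occupancy))
        for tf in present:
            single[tf] = single.get(tf, 0) + 1
        for tf_pair in combinations(present, 2):
            pair[tf_pair] = pair.get(tf_pair, 0) + 1
    return ({tf: c / n for tf, c in single.items()},
            {p: c / n for p, c in pair.items()})


# Toy occupancies, invented for illustration only (not data from the study).
toy = {
    "seq1": {"CRX": 2.1, "NRL": 0.9, "MAZ": 0.01},
    "seq2": {"CRX": 1.2, "NRL": 0.2, "MAZ": 0.7},
    "seq3": {"CRX": 0.3, "NRL": 0.0, "MAZ": 0.0},
    "seq4": {"CRX": 3.0, "NRL": 0.6, "MAZ": 0.8},
}
single_freq, pair_freq = motif_frequencies(toy)
print(single_freq)
print(pair_freq)
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Sorting the present motifs
          before forming pairs gives each unordered pair a single key, so that a pair is counted
          once per sequence regardless of the order in which the TFs are listed.</p>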
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">The results above predict that
          strong enhancers are more likely to be bound by a diverse but degenerate collection of
          TFs, compared with silencers or inactive sequences. We tested this prediction by examining
          in vivo TF binding using published ChIP-seq data for NRL <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib23"><span>23</span><span>Hao et
                al.</span><span>2012</span></a></cite> and MEF2D <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib2"><span>2</span><span>Andzelm et
                al.</span><span>2015</span></a></cite>. Consistent with the prediction, sequences
          bound by CRX and either NRL or MEF2D are approximately twice as likely to be strong
          enhancers as sequences bound only by CRX (<a href="#fig2" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 2e</a>). Sequences bound by all three TFs
          are the most likely to be strong or weak enhancers rather than silencers or inactive
          sequences. However, most strong enhancers are not bound by either NRL or MEF2D (<a
            href="#fig2" itemscope="" itemtype="http://schema.stenci.la/Link">Figure 2f</a>),
          indicating that binding of these TFs is not required for strong enhancers. Our results
          support the TF diversity hypothesis: CRX-targeted enhancers are co-occupied by multiple
          TFs, without a requirement for specific combinations of lineage-defining TFs.</p>
        <h3 itemscope="" itemtype="http://schema.stenci.la/Heading"
          id="strong-enhancers-have-higher-motif-information-content-than-silencers">Strong
          enhancers have higher motif information content than silencers</h3>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Our results indicate that both
          strong enhancers and silencers have a higher total motif content than inactive sequences,
          while strong enhancers contain a more diverse collection of motifs than silencers. To
          quantify these differences in the number and diversity of motifs, we computed the
          information content of CRX-targeted sequences using Boltzmann entropy. The Boltzmann
          entropy of a system is related to the number of ways the system’s molecules can be
          arranged, which increases with either the number or diversity of molecules (<cite
            itemscope="" itemtype="http://schema.stenci.la/Cite"><a
              href="#bib67"><span>67</span><span>Phillips et al.</span><span>2012</span></a></cite>,
          Chapter 5). In our case, each TF is a different type of molecule and the number of each TF
          is represented by its predicted occupancy for a <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">cis</em>-regulatory sequence. The number of
          molecular arrangements is thus <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">W</em>, the number of distinguishable
          permutations in which the TFs can be ordered on the sequence, and the information content of a
          sequence is then log<sub itemscope="" itemtype="http://schema.stenci.la/Subscript"><span
              data-itemtype="http://schema.org/Number">2</span></sub><em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">W</em> (Materials and methods).</p>
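        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">As a minimal sketch of this
          calculation, the code below assumes that W is the multinomial coefficient over the
          predicted occupancies, that is, the factorial of the total occupancy divided by the
          product of the factorials of the individual occupancies, and that factorials are extended
          to non-integer occupancies with the gamma function. Both are assumptions made here for
          illustration; the exact definition is given in Materials and methods and may differ. The
          function name information_content is hypothetical.</p>

```python
import math


def information_content(occupancies):
    """log2 of the multinomial coefficient W = N! / prod(n_i!).

    The gamma function (via math.lgamma) extends the factorials to
    non-integer predicted occupancies; this extension is an assumption here.
    """
    total = sum(occupancies)
    log_w = math.lgamma(total + 1) - sum(math.lgamma(n + 1) for n in occupancies)
    return log_w / math.log(2)


# Integer toy cases, chosen for illustration only.
print(round(information_content([2, 1]), 2))     # three motifs, two TFs
print(round(information_content([1, 1, 1]), 2))  # three motifs, three TFs
print(round(information_content([2, 2]), 2))     # four motifs, two TFs
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Under these assumptions, two
          motifs for one TF and one motif for a second TF give W = 3, or about 1.58 bits, whereas a
          sequence occupied by a single TF gives W = 1 and 0 bits regardless of how many motifs it
          contains.</p>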
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We found that on average,
          strong enhancers have higher information content than both silencers and inactive
          sequences (Mann-Whitney U test, p = 1 × 10<sup itemscope=""
            itemtype="http://schema.stenci.la/Superscript">–23</sup> and 7 × 10<sup itemscope=""
            itemtype="http://schema.stenci.la/Superscript">–34</sup>, respectively, <a href="#fig3"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 3a</a>, <a href="#supp5"
            itemscope="" itemtype="http://schema.stenci.la/Link">Supplementary file 5</a>),
          confirming that information content captures the effect of both the number and diversity
          of motifs. Quantitatively, the average silencer and the average inactive sequence contain 1.6
          and 1.4 bits, respectively, which corresponds to approximately three total motifs for two TFs. Strong
          enhancers contain on average 2.4 bits, representing approximately three total motifs for
          three TFs or four total motifs for two TFs. To compare the predictive value of our
          information content metric to the model based on all eight motifs, we trained a logistic
          regression model and found that information content classifies strong enhancers from
          silencers with an AUROC of 0.634 ± 0.008 and an AUPR of 0.663 ± 0.014 (<a href="#fig3"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 3b</a> and <a href="#fig3"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 3—figure supplement 1</a>).
          This is only slightly worse than the model trained on eight TF occupancies, which is
          itself comparable to the SVM with 2080 features, despite an eightfold reduction in the
          number of features. The difference between the two logistic regression models suggests that the
          specific identities of TF motifs make some contribution to the eight TF model, but that
          most of the signal captured by the SVM can be described with a single metric that does not
          assign weights to specific motifs. Information content also distinguishes strong enhancers
          from inactive sequences (AUROC 0.658 ± 0.012, AUPR 0.675 ± 0.019, <a href="#fig3"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 3b</a> and <a href="#fig3"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 3—figure supplement 1</a>).
          These results indicate that strong enhancers are characterized by higher information
          content, which reflects both the total number and diversity of motifs.</p>
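        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Because these classifiers use
          a single feature, their ROC curves depend only on how information content ranks the two
          classes, so the AUROC can be related to the Mann-Whitney U statistic as U divided by the
          product of the two class sizes. The sketch below applies this identity to the U
          statistics and class counts printed in the output of the Figure 3 code chunk. It assumes
          that the reported U counts the pairs in which the strong enhancer has the higher
          information content, and <code>auroc_from_u</code> is a hypothetical helper rather than
          part of the analysis code.</p>

```python
def auroc_from_u(u_stat, n_pos, n_neg):
    """AUROC implied by a Mann-Whitney U statistic for the positive class.

    Assumption: u_stat counts (positive, negative) pairs in which the
    positive example has the higher value, with ties counted as one half.
    """
    return u_stat / (n_pos * n_neg)


# Class sizes and U statistics as printed in the Figure 3 output
n_strong, n_silencer, n_inactive = 1051, 837, 928

print(round(auroc_from_u(557959.0, n_strong, n_silencer), 3))  # 0.634
print(round(auroc_from_u(641607.0, n_strong, n_inactive), 3))  # 0.658
```

        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Both values agree with the
          mean cross-validated AUROCs reported for the logistic regression models.</p>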
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig3"
          title="Figure 3 and Figure 3—figure supplement 1."><label data-itemprop="label">Figure 3
            and Figure 3—figure supplement 1.</label>
          <stencila-code-chunk itemscope="" itemtype="http://schema.stenci.la/CodeChunk"
            data-execution_count="17" data-programminglanguage="python">
            <pre class="language-python" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"
              slot="text"><code># Fit logistic regression models
entropy_clf = LogisticRegression()
entropy_clf, entropy_tpr_list, entropy_prec_list, entropy_f1_list = modeling.train_estimate_variance(entropy_clf, cv, wt_entropy_df.loc[silencer_modeling_mask, &quot;entropy&quot;], labels_with_silencer, xaxis, positive_cutoff=0)

inactive_entropy_clf = LogisticRegression()
inactive_entropy_clf, inactive_entropy_tpr_list, inactive_entropy_prec_list, inactive_entropy_f1_list = modeling.train_estimate_variance(inactive_entropy_clf, cv, wt_entropy_df.loc[inactive_modeling_mask, &quot;entropy&quot;], labels_with_inactive, xaxis, positive_cutoff=0)

# Setup figures
fig, ax_list = plot_utils.setup_multiplot(2, sharex=False, sharey=False)
fig_pr, ax_pr = plt.subplots()

# 3a: violin plot of information content
print(&quot;Information content for each class:&quot;)
display(wt_entropy_grouper[&quot;entropy&quot;].describe())

ax = ax_list[0]
fig = plot_utils.violin_plot_groupby(wt_entropy_grouper[&quot;entropy&quot;], &quot;Information content&quot;, class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))
plot_utils.rotate_ticks(ax.get_xticklabels())
ax.set_yticks(np.arange(0, wt_entropy_df[&quot;entropy&quot;].max() + 1, 2))
plot_utils.add_letter(ax, -0.2, 1.03, &quot;a&quot;)

# Add ticks above to show the n
ax_twin = ax.twiny()
ax_twin.set_xticks(ax.get_xticks())
ax_twin.set_xlim(ax.get_xlim())
ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)

# Statistics for differences in information content
ustat, pval = stats.mannwhitneyu(wt_entropy_grouper[&quot;entropy&quot;].get_group(&quot;Strong enhancer&quot;), wt_entropy_grouper[&quot;entropy&quot;].get_group(&quot;Silencer&quot;), alternative=&quot;two-sided&quot;)
print(f&quot;Strong enhancers and silencers have the same information content, Mann-Whitney U test p = {pval:.0e} U = {ustat:.2f}&quot;)
ustat, pval = stats.mannwhitneyu(wt_entropy_grouper[&quot;entropy&quot;].get_group(&quot;Strong enhancer&quot;), wt_entropy_grouper[&quot;entropy&quot;].get_group(&quot;Inactive&quot;), alternative=&quot;two-sided&quot;)
print(f&quot;Strong enhancers and inactive sequences have the same information content, Mann-Whitney U test p = {pval:.0e}, U = {ustat:.2f}&quot;)

# 3b: ROC and PR curves with information content vs. two classes
model_data = [
    (entropy_tpr_list, entropy_prec_list, &quot;Strong vs.\nsilencer&quot;, &quot;#E69B04&quot;),
    (inactive_entropy_tpr_list, inactive_entropy_prec_list, &quot;Strong vs.\ninactive&quot;, plot_utils.set_color(1))
]

model_tprs, model_precs, model_names, model_colors = zip(*model_data)
ax = ax_list[1]

# Plot the models
_, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std = plot_utils.roc_pr_curves(
    modeling_xaxis, model_tprs, model_precs, model_names, model_colors=model_colors,
    figax=([fig, fig_pr], [ax, ax_pr])
)
ax.set_xticks(np.linspace(0, 1, 6))
plot_utils.add_letter(ax, -0.2, 1.03, &quot;b&quot;)

# Display model metrics
print(&quot;Model metrics:&quot;)
for name, auroc, auroc_std, aupr, aupr_std in zip(model_names, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std):
    print(f&quot;{name}\tAUROC={auroc:.3f}+/-{auroc_std:.3f}\tAUPR={aupr:.3f}+/-{aupr_std:.3f}&quot;)
    
print(&quot;Figure 3:&quot;)
fig.tight_layout()
display(fig)
print(&quot;Figure 3--figure supplement 1:&quot;)
display(fig_pr)
plt.close()
plt.close()</code></pre>
            <figure slot="outputs">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Information content for each class:
</code></pre>
              <table itemscope="" itemtype="http://schema.org/Table">
                <thead>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">count</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">mean</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">std</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">min</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">25%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">50%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">75%</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">max</th>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell">group_name_WT</th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                    <th itemscope="" itemtype="http://schema.stenci.la/TableCell"></th>
                  </tr>
                </thead>
                <tbody>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Silencer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">837</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.554721</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.872824</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000173</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.195721</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.952877</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.240308</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">15.248629</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Inactive</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">928</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.385812</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.646322</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000105</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.150796</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.841681</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.050814</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">14.738741</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Weak enhancer</td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1360</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.49678</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.683849</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000008</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.201747</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.014613</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.216628</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">17.960698</span></td>
                  </tr>
                  <tr itemscope="" itemtype="http://schema.stenci.la/TableRow">
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell">Strong enhancer
                    </td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1051</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.383258</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">2.1786</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.000173</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">0.635291</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">1.836731</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">3.453384</span></td>
                    <td itemscope="" itemtype="http://schema.stenci.la/TableCell"><span
                        data-itemtype="http://schema.org/Number">13.082139</span></td>
                  </tr>
                </tbody>
              </table>
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Strong enhancers and silencers have the same information content, Mann-Whitney U test p = 1e-23 U = 557959.00
Strong enhancers and inactive sequences have the same information content, Mann-Whitney U test p = 7e-34, U = 641607.00
Model metrics:
Strong vs.
silencer	AUROC=0.634+/-0.008	AUPR=0.663+/-0.014
Strong vs.
inactive	AUROC=0.658+/-0.012	AUPR=0.675+/-0.019
Figure 3:
</code></pre><img src="index.html.media/17" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
              <pre class="language-text" itemscope="" itemtype="http://schema.stenci.la/CodeBlock"><code>Figure 3--figure supplement 1:
</code></pre><img src="index.html.media/18" alt="" itemscope=""
                itemtype="http://schema.org/ImageObject">
            </figure>
          </stencila-code-chunk>
          <figcaption>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="figure-3-information-content-classifies-strong-enhancers">Figure 3: Information
              content classifies strong enhancers.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">(<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">a</strong>) Information content for
              different activity classes. (<strong itemscope=""
                itemtype="http://schema.stenci.la/Strong">b</strong>) Receiver operating
              characteristic of information content to classify strong enhancers from silencers
              (orange) or inactive sequences (indigo).</p>
            <h4 itemscope="" itemtype="http://schema.stenci.la/Heading"
              id="figure-3figure-supplement-1-precision-recall-curve-of-logistic-regression-classifier-using-information-content">
              Figure 3—figure supplement 1: Precision-recall curve of logistic regression classifier
              using information content.</h4>
            <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Orange, strong enhancer vs.
              silencer; indigo, strong enhancer vs. inactive; shaded area, 1 standard deviation
              based on fivefold cross-validation.</p>
          </figcaption>
        </figure>
        <h3 itemscope="" itemtype="http://schema.stenci.la/Heading"
          id="strong-enhancers-require-high-information-content-but-not-nrl-motifs">Strong enhancers
          require high information content but not NRL motifs</h3>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">Our results show that except
          for CRX, none of the lineage-defining motifs occur in a majority of strong enhancers.
          However, all sequences were tested in reporter constructs with the <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter, which contains an NRL
          motif and three CRX motifs <span itemscope=""
            itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib9"><span>9</span><span>Corbo et
                  al.</span><span>2010</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a
                href="#bib47"><span>47</span><span>Kwasnieski et
                  al.</span><span>2012</span></a></cite></span>. Since NRL is a key co-regulator
          with CRX in rod photoreceptors, we tested whether strong enhancers generally require NRL,
          which would be inconsistent with our TF diversity hypothesis. We removed the NRL motif by
          recloning our MPRA library without the basal <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter. If strong enhancers
          require an NRL motif for high activity, then only CRX-targeted sequences with NRL motifs
          will drive reporter expression. If instead information content (i.e. total motif content
          and diversity) is the primary determinant of strong enhancers, then only CRX-targeted
          sequences with sufficient motif diversity, as measured by information content, will drive
          reporter expression, regardless of whether NRL motifs are present.</p>
        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">We replaced the <em
            itemscope="" itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter with a
          minimal 23 bp polylinker sequence between our libraries and <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">DsRed</em>, and repeated the MPRA (<a
            href="#fig1s1" itemscope="" itemtype="http://schema.stenci.la/Link">Figure 1—figure
            supplement 1</a>, <a href="#supp3" itemscope=""
            itemtype="http://schema.stenci.la/Link">Supplementary file 3</a>). CRX-targeted
          sequences were designated as ‘autonomous’ if they retained activity in the absence of the
          <em itemscope="" itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter (log<sub
            itemscope="" itemtype="http://schema.stenci.la/Subscript"><span
              data-itemtype="http://schema.org/Number">2</span></sub>(RNA/DNA) &gt; 0, Materials and
          methods). We found that 90% of autonomous sequences are from the enhancer class, while
          less than 3% of autonomous sequences are from the silencer class (<a href="#fig4"
            itemscope="" itemtype="http://schema.stenci.la/Link">Figure 4a</a>). This confirms that
          the distinction between silencers and enhancers does not depend on the <em itemscope=""
            itemtype="http://schema.stenci.la/Emphasis">Rho</em> promoter, which is consistent with
          our previous finding that CRX-targeted silencers repress other promoters <span
            itemscope="" itemtype="http://schema.stenci.la/CiteGroup"><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib32"><span>32</span><span>Hughes
                  et al.</span><span>2018</span></a></cite><cite itemscope=""
              itemtype="http://schema.stenci.la/Cite"><a href="#bib86"><span>86</span><span>White et
                  al.</span><span>2016</span></a></cite></span>. However, while most autonomous
          sequences are enhancers, only 39% of strong enhancers and 9% of weak enhancers act
          autonomously. Consistent with a role for information content, autonomous strong enhancers
          have higher information content (Mann-Whitney U test p = 4 × 10<sup itemscope=""
            itemtype="http://schema.stenci.la/Superscript">–8</sup>, <a href="#fig4" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 4b</a>) and higher predicted CRX
          occupancy (Mann-Whitney U test p = 9 × 10<sup itemscope=""
            itemtype="http://schema.stenci.la/Superscript">–12</sup>, <a href="#fig4" itemscope=""
            itemtype="http://schema.stenci.la/Link">Figure 4c</a>) than non-autonomous strong
          enhancers. We found no evidence that specific lineage-defining motifs are required for
          autonomous activity, including NRL, which is present in only 25% of autonomous strong
          enhancers (<a href="#fig4" itemscope="" itemtype="http://schema.stenci.la/Link">Figure
            4d</a>). Similarly, NRL ChIP-seq binding <cite itemscope=""
            itemtype="http://schema.stenci.la/Cite"><a href="#bib23"><span>23</span><span>Hao et
                al.</span><span>2012</span></a></cite> occurs more often among autonomous strong
          enhancers (41% vs. 19%, Fisher’s exact test p = 2 × 10<sup itemscope=""
            itemtype="http://schema.stenci.la/Superscript">–14</sup>, odds ratio = 3.0), yet NRL
          binding still only accounts for a minority of these sequences. We thus conclude that
          strong enhancers require high information content, rather than any specific
          lineage-defining motifs.</p>
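        <p itemscope="" itemtype="http://schema.stenci.la/Paragraph">As a rough consistency check,
          the odds ratio for NRL ChIP-seq binding can be approximated from the two rounded
          percentages given above. This is only a sketch: <code>odds_ratio</code> is a hypothetical
          helper, the inputs are rounded percentages rather than the underlying counts, and the
          Fisher's exact test itself is not reproduced because the counts are not stated in the
          text.</p>

```python
def odds_ratio(frac_a, frac_b):
    """Odds ratio between two groups given the fraction of positives in each."""
    odds_a = frac_a / (1 - frac_a)
    odds_b = frac_b / (1 - frac_b)
    return odds_a / odds_b


# 41% of autonomous strong enhancers vs. 19% of the comparison group
print(round(odds_ratio(0.41, 0.19), 1))  # 3.0
```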
        <figure itemscope="" itemtype="http://schema.stenci.la/Figure" id="fig4" title="Figure 4.">
          <label data-itemprop="label">Figure 4.</label>
          <stencila-code-chunk itemscope="" itemtype=