# Colored de Bruijn Graphs

I have wanted to write this post for a long time, but I am only now getting around to it. For more than a year, my research in the Computational Sciences Lab (CSL) at Brigham Young University (BYU) has focused on various applications of the Colored de Bruijn Graph (CdBG). It all started when we explored some novel phylogenetic reconstruction methods in the CS 601R class during Winter semester 2017. We (or at least, I) kept being drawn back to CdBGs and their potential for phylogeny reconstruction. Here are some of the things that I have learned along the way!

As with most scientific endeavors, this project certainly stands on the shoulders of giants. Some of these giants include the following papers and their respective authors. I think that they have done amazing work and I admire their methods.

## Motivation

We want to use the CdBG to reconstruct phylogenetic trees because it is computationally efficient. The CdBG can be constructed in $$O(n)$$ time and space, and it can utilize whole genome sequences, which many of the traditional phylogenetic tree reconstruction algorithms cannot do.

Furthermore, we figured that the CdBG contains more information than many of the kmer counting methods. If those methods can perform so well, then the CdBG should perform even better, because it not only stores the kmers (as nodes in the graph) but also stores the context in which those kmers occur (as edges between kmers that overlap by $$k - 1$$ basepairs).

## Our Contribution

### kleuren

In order to test our hypothesis, we did what every self-respecting Computer Scientist would do: we wrote a program to figure out if it worked. We call our program kleuren, which is Dutch for “colors” (referring to the colors in the CdBG).

kleuren works by finding bubble regions of the CdBG. A bubble is a subgraph of the CdBG that consists of a start node and an end node that are each present in $$n$$ or more colors, with multiple paths connecting the start node to the end node. Here $$n$$ is a given parameter that is no greater than the total number of colors in the CdBG, and a path is simply a traversal from one node to another.

After the bubbles are found, they are aligned through Multiple Sequence Alignment (MSA) via MAFFT and then each MSA block is concatenated to form a supermatrix. The supermatrix is then fed into a Maximum-Likelihood program (IQ-TREE) to reconstruct the phylogenetic tree of the taxa.

### How Bubbles are Found

kleuren uses fairly simple and straightforward algorithms to find the bubbles. The process is broken up into two steps: Finding the End Node and Extending the Paths.

#### Finding the End Node

kleuren iterates over the super-set (the union of all kmers from all taxa) as potential start nodes (in a dBG the nodes are kmers, thus $$node == kmer$$). Given a kmer, it is queried in the CdBG and the number of taxa (or colors, thus $$taxon == color$$) is calculated to determine if the number of colors for that kmer, $$c$$, is greater than or equal to $$n$$, where $$n$$ is a parameter provided by the user.

If $$c \geq n$$ then the node is a valid start node and a breadth-first search is performed starting from this node until another node is found where the number of colors that it contains is greater than or equal to $$n$$, which then becomes the end node.
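The search described above can be sketched roughly as follows. This is only an illustration and not kleuren's actual code: the dictionary-based graph, the function names `successors` and `find_end_node`, and the toy kmers are all assumptions made for the example, and reverse complements are ignored.

```python
from collections import deque


def successors(kmer, graph):
    """Yield the kmers in the graph that overlap this kmer by k - 1 bases."""
    for base in "ACGT":
        candidate = kmer[1:] + base
        if candidate in graph:
            yield candidate


def find_end_node(start, graph, n):
    """Breadth-first search from start for the nearest kmer with >= n colors."""
    if len(graph.get(start, ())) < n:
        return None  # not a valid start node
    queue = deque(successors(start, graph))
    seen = {start}
    while queue:
        kmer = queue.popleft()
        if kmer in seen:
            continue
        seen.add(kmer)
        if len(graph[kmer]) >= n:
            return kmer
        queue.extend(successors(kmer, graph))
    return None


# Hypothetical toy CdBG: each kmer maps to the set of colors (taxa) containing it.
toy_graph = {
    "ACT": {"taxon1", "taxon2"},
    "CTG": {"taxon1"},
    "CTA": {"taxon2"},
    "TGG": {"taxon1"},
    "TAG": {"taxon2"},
    "GGC": {"taxon1", "taxon2"},
    "AGG": {"taxon2"},
}

end = find_end_node("ACT", toy_graph, n=2)
print(end)  # GGC
```

In this toy graph the two taxa leave ACT along different paths and meet again at GGC, so ACT and GGC bound a bubble.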

#### Extending the Paths

After an end node is discovered, the sequence of each path between the start and end nodes must be calculated. In order to discover a path in a dBG, one must collapse edges by appending the last nucleotide of the next node to the previous node’s sequence. For example, if an edge is ACT -> CTG, then the collapsed sequence will turn out to be ACTG.
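Collapsing a path can be sketched in a few lines. The function name `collapse_path` is made up for this example, and the sketch assumes the path is given as a list of kmers in traversal order.

```python
def collapse_path(kmers):
    """Merge consecutive kmers that overlap by k - 1 bases into one sequence."""
    if not kmers:
        return ""
    sequence = kmers[0]
    for prev, nxt in zip(kmers, kmers[1:]):
        if prev[1:] != nxt[:-1]:
            raise ValueError(f"{prev} and {nxt} do not overlap by k - 1 bases")
        sequence += nxt[-1]  # only the last nucleotide is new
    return sequence


print(collapse_path(["ACT", "CTG"]))         # ACTG
print(collapse_path(["ACT", "CTG", "TGG"]))  # ACTGG
```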

This is implemented as at most $$c$$ depth-first searches, where $$c$$ is the number of colors. The number of depth-first searches decreases as the number of paths with shared colors increases.

If you are interested in the details of our algorithm and would like to see some results, please check out our paper Whole Genome Phylogenetic Tree Reconstruction Using Colored de Bruijn Graphs (preprint). We are currently working on extending kleuren to improve its efficiency.

# Genome Mapping Post Processing

So you have mapped your reads to the reference genome, but what comes next? How can you tell how many reads were aligned, where they were aligned, and actually see what your mapping algorithm did? This post will show you how you can analyze the results of your genome mapper by using samtools and IGV.

# samtools

## SAM File Format

Most modern genome aligners will output the aligned reads in the SAM file format. If you have a lot of time on your hands you can read through this SAM file, see where the reads are mapped to, and read the information for each read (which probably adds up to millions of lines). Ain’t nobody got time for that. Instead, we are going to have samtools do the work for us.

## Installation

2. Go to the directory where you downloaded it and unzip the file
    • $ cd ~/Downloads
    • $ tar jxvf samtools-1.3.1.tar.bz2
3. Go to the unzipped directory and prepare for install
    • $ cd samtools-1.3.1
    • $ ./configure
    • Note: You may get a library dependency error if you don’t have the development files for the ncurses library installed. If you do get this error, install either libncurses5-dev or ncurses-devel.
4. If the ./configure command executed without errors, install samtools
    • $ sudo make install
    • Note: If you do not have super-user privileges, run $ ./configure --prefix=<directory of your choice> in step 3 instead, and then run $ make install without sudo.
5. Test if the install worked by running samtools

Now we can finally view the aligned reads in our terminal! Run samtools tview examples/toy.sorted.bam examples/toy.fa and you will suddenly see four reads aligned to an extremely small genome. You should see something along these lines:

    1 11 21 31 41 51
    AGCATGTTAGATAA****GATA**GCTGTGCTAGTAGGCAG*TCAGCGCCATNNNNNNNN
    ........ .... ......K.K......K. ..........
    ........AGAG....***... ,,,,, ,,,,,,,,,
    ......GG**....AA
    ..C...**** ...**...>>>>>>>>>>>>>>T.....

The numbers at the top signify the index of the genome, and the first line of characters represents the reference genome itself. The third line is the consensus sequence (see https://en.wikipedia.org/wiki/Consensus_sequence to discover what K represents). Each line under the consensus sequence is a read. You may be wondering why most of the reads are made of . characters; that is because they match the reference genome. There are many different settings that you can play with in samtools tview. To view all of the settings, type ? and a help menu will appear.

# Integrative Genomics Viewer (IGV)

If you want more flexibility and a more robust way of viewing your aligned reads, you can use IGV. It has a GUI, which makes things nice sometimes.

## Installation

Download the appropriate files according to the system that you are running. IGV is written in Java (using Java 7; I am not sure if Java 8 or 9 will work, but it should), so you need to make sure that you have Java installed on your computer. Once you have Java installed and IGV downloaded, go ahead and unzip the downloaded file, if needed. Then, if you are on a Unix-like OS (Mac or Linux), you can run ./IGV_2.3.88/igv.sh to open up the IGV GUI.

Once the program is open, click the Genomes button at the top of the window, then select Load Genome from File.... Select the file samtools-1.3.1/examples/toy.fa.

After the reference genome is loaded we can load in the reads by clicking the File button at the top of the window, then select Load from File.... Select the file samtools-1.3.1/examples/toy.sorted.bam.

## Seeing the Reads

You probably can’t see any changes in the view, that’s ok. In the second row click on the dropdown arrow that says All and select either ref or ref2 and then you will be able to see what we saw in samtools tview, except it is way easier to figure out what everything means!

# Suffix Trees

## What is a Suffix Tree?

A suffix tree is a data structure that contains each suffix of a string, where a suffix is defined as the end part of a string. For example, the suffixes for the string banana are:

• a
• na
• ana
• nana
• anana
• banana

Notice that the entire string is also considered a suffix. We can also include the empty string as a suffix, so in order to include the empty string we need to append a character that isn’t in the alphabet to the string. We will assume that the character $ is not in the alphabet. Our string is now banana$, and the suffixes are:

• $
• a$
• na$
• ana$
• nana$
• anana$
• banana$

## How to Construct a Suffix Tree

The easiest way to understand how to construct a suffix tree is to first construct a suffix trie, then collapse nodes to convert the trie to a tree. Here is the graphical representation of the suffix trie:

Now we collapse the nodes that only have one child, and the resulting suffix tree is this:

## How to Use a Suffix Tree

Observe that each suffix is represented in this structure. This means that we can efficiently search for a suffix (or substring) by traversing the suffix tree. For example, if we were to search for the substring ana in the suffix tree for banana$, this would be the traversal path:

0 -> 2 -> 3
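The same kind of search can be tried out with a small sketch. Note the assumptions: this builds the uncollapsed suffix trie (not the collapsed suffix tree) as nested dictionaries, it takes quadratic time and space, and the names `build_suffix_trie` and `contains_substring` are made up for this example.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie (naive, quadratic)."""
    root = {}
    for start in range(len(text)):
        node = root
        for char in text[start:]:
            node = node.setdefault(char, {})
    return root


def contains_substring(trie, pattern):
    """Walk down the trie one character at a time; fail on a missing edge."""
    node = trie
    for char in pattern:
        if char not in node:
            return False
        node = node[char]
    return True


trie = build_suffix_trie("banana$")
print(contains_substring(trie, "ana"))   # True
print(contains_substring(trie, "ana$"))  # True
print(contains_substring(trie, "nab"))   # False
```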

Note: We can stop at any node because we are searching for a substring rather than a suffix. If we were searching for a suffix, the search string would have been ana$.

## Suffix Tree Construction

How would you create a naive algorithm to construct a suffix tree? You can easily create a simple algorithm to construct a suffix tree in $$O(n^2)$$, but Esko Ukkonen discovered a way to construct a suffix tree in $$O(n)$$ (linear time).

# Genome Assembly

# What is Genome Assembly?

Genome assembly is often compared to putting a puzzle together. In this analogy, the pieces of the puzzle are individual reads that we get from a DNA sequencer. The ultimate goal of putting a puzzle together is to find out exactly where each and every piece fits, so that you can see the completed picture. While this is also true of genome assembly, we need to be realistic. The first step in genome assembly is to generate contigs, or in the words of our puzzle analogy, to combine pieces that we know go together.

## What is a Contig?

A contig is short for contiguous sequence. It is a sequence that is longer than the reads and shorter than the genome (technically the whole assembled genome could be considered a contig in an organism with a single chromosome, but for our purposes it is considered to be shorter than the entire genome). If you have a puzzle that is a picture of a beach and an ocean, you would combine all of the tan colored pieces and then all of the blue colored pieces. You may even combine the pieces that make up the colorful beach umbrellas littering the beach. When you do this you are generating contigs! You are making small chunks of the whole because they are easy to identify, and it is the same with genome assembly.

## How Do We Discover Contigs?

Well, we know that the pieces of our puzzle are reads. A regular puzzle piece has four sides with different shapes that define exactly where it should go. Our reads also have “sides” that tell us the proper place they should be in the contig.
These “sides” are revealed when we break the read up into kmers and insert them into a de Bruijn graph. We can then construct the contigs by extending through the non-branching nodes (a non-branching node has exactly one incoming edge and one outgoing edge).

Imagine that we are assembling an incredibly small genome, with way too few reads, which are the following:

ACTGT TCTGT CTGTT CTGTA GCATA TGTTA GTTAC

The reads are of length 5. The de Bruijn graph using the kmer length of 5 (the entire read) for this set of reads would look like this (the non-branching node is in red):

As you can see there is only one non-branching node in the graph. We can only generate the following contigs:

ACTGT TCTGT CTGTTA CTGTA GCATA TGTTA GTTAC

I hope it is clear that this is not a very good assembly. Why isn’t this assembly good? Well, for the most part all of our contigs are the same length as our reads, except for one, which is only one base pair longer than the read length. This assembly is equivalent to only connecting one puzzle piece. Let’s see how we can improve this assembly.

### Kmer Length

The length of kmer that you use to construct your de Bruijn graph will greatly influence your assembly, for better or for worse. What will happen if we decrease the kmer length from 5 to 3? Here is the de Bruijn graph of kmer length 3:

What is up with all of the edges? Each edge represents an occurrence of that kmer; for example, the kmer ACT has 4 edges to CTG because there are 4 occurrences of CTG. Let’s clean this graph up by giving weights to the edges. The number on each edge represents how many times that edge is repeated. Here is the de Bruijn graph of kmer size 3 with weighted edges (the non-branching nodes are in red):

Let’s see if a kmer length of 3 is any better; we definitely have more branching nodes, but will that lead to longer contigs?
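The kmer counts behind those edge weights can be checked with a short sketch. The helper names (`kmers`, `kmer_counts`, `overlap_edges`) are made up for this example, and connecting every pair of kmers that overlap by $$k - 1$$ bases is an assumption about how the graphs in this post were drawn.

```python
from collections import Counter

reads = ["ACTGT", "TCTGT", "CTGTT", "CTGTA", "GCATA", "TGTTA", "GTTAC"]


def kmers(read, k):
    """Return every length-k substring of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def kmer_counts(reads, k):
    """Count how many times each kmer occurs across all reads."""
    return Counter(kmer for read in reads for kmer in kmers(read, k))


def overlap_edges(nodes):
    """Connect u -> v when the last k - 1 bases of u equal the first k - 1 of v."""
    return sorted((u, v) for u in nodes for v in nodes if u[1:] == v[:-1])


counts = kmer_counts(reads, 3)
edges = overlap_edges(counts)
print(counts["CTG"])            # 4
print(("ACT", "CTG") in edges)  # True
```

Calling `kmer_counts(reads, 5)` instead gives seven kmers that each occur once, which is why the graph with a kmer length of 5 has so little to work with.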
Essentially we have made our puzzle pieces smaller, so just because we can put together more puzzle pieces doesn’t mean that we are building more of the puzzle. Here are the contigs generated by the graph of kmer length 3:

ACT TCT CTG TGTTA TGTA GCATA TAC

Well, compared to the assembly using kmers of length 5 we have the same number of contigs, with a shorter average, and the longest contig is only the length of the reads. Is this assembly better or worse than the last one? Short answer, yes. You have to define “better” when classifying assemblies. You may want to have the longest average contig length, or simply the longest contig. You may have some other metric, like N50, by which you determine which assembly is “better.”

## Filtering Errors

One way of accounting for errors in your algorithm is to remove edges below a certain weight. When an edge has a higher value, it occurs more often. If an edge occurs more often, then it is more likely to be part of a valid contig.

# What is Next?

You may be asking yourself, “I can see how we generate contigs, and how that can be useful, but we still don’t have one sequence that represents the genome. We haven’t fully completed the puzzle. How do we do that?” If you would like to fully complete the puzzle, to put all of these contigs together, you would have to construct scaffolds. A scaffold is a representation of how contigs relate to each other that also accounts for gaps in the sequence. Depending on how much data you have, you may not be able to create one continuous sequence.

NOTE: Scaffolding is not required for the CS 418/BIO 365 Genome Assembler project. Generating contigs is good enough!

# How to Assemble using Velvet

# Genome Assembly using Velvet

## What the heck is Genome Assembly?

Genome assembly is the process of constructing long contiguous sequences from shorter sequences. Think of this problem at a genomic scale. Same approach, just a lot more data.

## What the heck is Velvet?
Velvet is a genome assembler that uses a de Bruijn graph to generate contigs. If you are interested in reading the paper describing how Velvet works, feel free to read Velvet: Algorithms for de novo short read assembly using de Bruijn graphs.

### Installing Velvet

I say that the most important part of using software is figuring out how to install it. Sometimes it can be harder than you think. Here is how you install Velvet:

1. Download the source
    • Optional: Check out the sweet Velvet website complete with web 2.0 design. Al Gore would be proud.
2. Go to the directory in which you downloaded the file $ cd ~/Downloads and unzip the file $ tar zxvf velvet_1.2.10.tgz
3. Go to the just unzipped directory $ cd velvet_1.2.10 and compile Velvet by issuing the command $ make
    • Error warning!! If you get an error that says something along the lines of fatal error: zlib.h: No such file or directory then try installing the package zlib1g-dev and then running $ make again
4. If you didn’t get any errors, it looks like you have installed Velvet! If you got errors, Google the error and figure out how to fix it.

### Running Velvet

To execute the Velvet program, make sure that you are in the velvet_1.2.10 directory and then type $ ./velveth and it should return a short help message. If it didn’t, check to see if you are in the correct directory by issuing the command $ pwd.

# Assembling the Zika virus genome

We have some reads from the Zika virus, fresh from Florida. We want to assemble the Zika virus genome to help find a cure. Download the reads zika.read1.fastq and zika.read2.fastq, then run this command $ ./velveth zika_genome 20 -fastq -shortPaired ~/Downloads/zika.read1.fastq ~/Downloads/zika.read2.fastq. This is a sort of preprocessing command that prepares your dataset so that it can be assembled. Here is what the parameters mean:

• ./velveth- the program that we use
• zika_genome- this is the output directory for all of the files
• 20- this is the hash (in other words, kmer) size that we use; you will want to play around with this
• -fastq- this is the type of input files that we have
• -shortPaired- this is the type of input reads that we have
• ~/Downloads/zika.read1.fastq- this is the first file of reads
• ~/Downloads/zika.read2.fastq- this is the second file of reads

Note: You can have an unlimited number of input files.

## Assembling the reads

We are now going to use the program ./velvetg to actually construct the de Bruijn graph and assemble the reads. Issue the command $ ./velvetg zika_genome/ -cov_cutoff 4 -min_contig_lgth 100, and now you have assembled your first genome! Here is what the parameters mean:

• ./velvetg- the program that we use
• -cov_cutoff 4- this removes the nodes that have a coverage less than 4
• -min_contig_lgth 100- this gives us all of the contigs that are greater than 100 bases

## Viewing the generated contigs

Now we can see the contigs that this command generated by running $ cd ./zika_genome and $ less contigs.fa. Feel free to explore around in this directory for other cool stuff about the contigs!

Happy assembling!

## Above and beyond…

You can compare your generated contigs with the NCBI Reference Sequence for the Zika virus to see how good (or how poor) your genome assembly actually is!

# Python Tutorial

### 30 August 2016

# This is a comment

""" This is a
multi-
line
comment"""

''' This is also a
multi-
line
comment'''

# How do I declare variables?

# String
my_string = 'Hello, CS 418/BIO 365!!'
my_other_string = "Hello, friends."

# Numbers
my_number = 418
my_other_number = 365.578

# List
my_list = ['this', 'is', 'fun', 56]
# or
my_other_list = []
# or
my_other_list2 = list()

my_other_list.append('bioinformatics')

# Dictionary (Map)
my_dict = {'key': 'value', 'other_key': 'other_value'}
# or
my_other_dict = {}
# or
my_other_dict2 = dict()

my_other_dict['my_key'] = 'my_value'

# How do I make loops?

# NOTE: Whitespace matters!!

# for loop
for i in my_list:
    print(i)

# NOTE: "print(i)" is used in Python 3, "print i" is used in Python 2

this
is
fun
56

# for loop over range
for i in range(0, 5):
    my_other_list.append(i)
print(my_other_list)

['bioinformatics', 0, 1, 2, 3, 4]

# while loop
while len(my_list) > 0:
    del my_list[-1]
    print(my_list)

['this', 'is', 'fun']
['this', 'is']
['this']
[]

# break
for val in my_other_list:
    if val == 2: # compare values with ==, not "is"
        break
    else:
        print(val)

bioinformatics
0
1

# continue
for val in my_other_list:
    if val != 2: # compare values with !=, not "is not"
        continue
    print(val)

2

# How do I perform file I/O?

# file input
import sys # this imports the library to get command line arguments
with open(sys.argv[1]) as fh: # sys.argv[1] is the first command line argument, sys.argv[2] is the second ... and so on.
    my_first_line = next(fh) # Python 2 & 3
    my_next_line = next(fh) # NOTE: fh.next() only works in Python 2

with open(sys.argv[1]) as other_fh:
    # iterate over all lines in file
    for line in other_fh:
        print(line)

ACGTTGCATGTCGCATGATGCATGAGAGCT

4

# file output
# NOTE: the second argument ('w') opens the file for writing
with open('./my_file.txt', 'w') as writable:
    writable.write('This is my test file\n')
    writable.write(my_first_line)

# How do I manipulate strings?

# string slice
my_string = 'banana'
print(my_string[3])
print(my_string[0:5])
print(my_string[1:5])
print(my_string[-1])
print(my_string[0:-2])

a
banan
anan
a
bana


# Rosalind Tutorial

## The first problem is found here: http://rosalind.info/problems/ba1b/?class=322

# What do we need to do in pseudo code?

''' -Read in the sequence and the
length of the kmer (a kmer is essentially a substring)
-Break up the sequence into kmers
-Count each kmer
-Find the kmer(s) with the highest counts
-Print out the highest count kmers'''

# Read in the sequence and the length of the kmer from a file
import sys
with open(sys.argv[1]) as file:
    seq = next(file).strip() # REMEMBER: in Python 2 use file.next()
    kmer_len = int(next(file).strip())

# Break up the sequence into kmers
for i in range(0, len(seq) - kmer_len + 1): # stop where the last full kmer starts
    kmer = seq[i : i + kmer_len]

# Count each kmer
counts = {}
for i in range(0, len(seq) - kmer_len + 1): # stop where the last full kmer starts
    kmer = seq[i : i + kmer_len]
    if kmer not in counts:
        counts[kmer] = 1
    else:
        counts[kmer] += 1

# Find the kmer(s) with the highest counts
max_count = 0
max_kmers = []
for kmer, count in counts.items(): # NOTE: use counts.iteritems() in Python 2
    if count > max_count:
        max_count = count
        max_kmers.clear()
        max_kmers.append(kmer)
    elif count == max_count:
        max_kmers.append(kmer)

# Print out the kmers with the highest counts
print(" ".join(max_kmers))
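As an optional alternative, the counting and the search for the maximum can be combined with collections.Counter from the standard library. This is just a sketch of the same idea: the function name `most_frequent_kmers` is made up for this example, and the sequence is hard-coded here instead of being read from a file.

```python
from collections import Counter


def most_frequent_kmers(seq, kmer_len):
    """Return the most frequent kmers of the given length, sorted alphabetically."""
    counts = Counter(seq[i:i + kmer_len] for i in range(len(seq) - kmer_len + 1))
    if not counts:
        return []  # the sequence is shorter than kmer_len
    max_count = max(counts.values())
    return sorted(kmer for kmer, count in counts.items() if count == max_count)


print(" ".join(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4)))  # CATG GCAT
```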

# You Don't Know How Bad You Are Until You Try to Be Good

One interesting phenomenon that I have noticed in myself is that I perceive myself to be better than I actually am. Call it pride or high expectations; either way, it seems that I never realize where I was until I begin to improve. When we don’t try to improve ourselves, we have nothing to compare ourselves to in the future. If we never surpass our current state, then we will never realize that we were ever lacking in our old state.

I have also noticed that the moment that we begin on the path of improvement it seems as if that path is impossibly difficult, so much more difficult than it was before. I believe that this is due to the fact that change is difficult, and as stated previously, I believe that my abilities are higher than they are in reality.

# Why try to improve?

Some may ask what the point is in striving to do your best. Is your current state not good enough? I can only speak for myself (obviously), so why do I strive to do my best? I see achieving your highest potential as the greatest challenge this life has to offer. Whether it be in sports, music, academics, or business, it is human nature to embrace difficulties and overcome them. I see overcoming weaknesses and flaws in my character as a challenge that I enjoy immensely.

# Why is improving so difficult?

You don’t know how hard improving is going to be until you attempt it. If our current state of being weren’t easy or comfortable, then we wouldn’t be that way in the first place. It is much easier to stay the way you are than to fundamentally change your character; however, change for the better is always worth it, no matter how difficult it is. Doing what we have always done is definitely easier, but making the change that we want can bring back the satisfaction that may be lacking in our lives.

-Cole

# Wait for it… Or not.

### When you can’t wait…

Source: http://www.consciousvanguard.com/blog/2015/10/22/patience-is-the-virtue

We live in a world where prayers to the ‘omniscient being of the universe’ begin with OK Google and are answered instantaneously with more information than you will ever want to know. Furthermore, having the ability to obtain almost any product that one could desire with free two-day shipping doesn’t help us learn how to wait either.

### What is the point of waiting?

What value does patience add to our lives? Learning how to wait is valuable because at some point in your life there will be something that you can’t receive/achieve instantaneously. There are many worthy endeavors that will require you to work long and hard, and if you aren’t patient enough to work to achieve those goals you will never achieve them.

Patience is arguably one of the most difficult attributes to develop. I believe the only way that you can develop patience is by waiting for it (get it?).

-Cole

Exported from Medium on August 7, 2016.

# Precision Medicine- Rogue Therapeutics Harvard 2016

Rare disease, genomics and patient-driven medicine may be terms that you have never heard of. All of these terms relate to precision medicine, medical treatments that are tailored to a specific patient.

These patients usually have some sort of rare disease, which is any disease that affects fewer than 200,000 people (in the United States). However, while the diseases themselves may be rare, having a rare disease is not. There are approximately 30 million people in the United States who suffer from a rare disease.

#### What is the purpose of precision medicine?

The goal of precision medicine is to develop treatments to alleviate rare diseases. The range of treatments available is quite limited because 80% of rare diseases are genetic. This means that while traditional medications could treat rare diseases, they will never permanently cure genetic disorders.

#### Undiagnosed Disease Network

Even though precision medicine as a discipline is still in its infancy, there are many organizations that aim to make precision medicine available to every patient. One such organization is the Undiagnosed Disease Network (UDN). This is an initiative funded by the NIH to help those with rare diseases obtain a diagnosis. While only 50% of the currently known rare diseases have a foundation studying and supporting patients with that disease (let alone a cure or treatment for that disease), the UDN can help patients towards a diagnosis.

#### Karen, Ornella & Lysogene

Another example of an organization that supports rare diseases and has successfully produced one form of precision medicine (so far) is Lysogene. This company was founded by a mother, Karen Aiach, whose daughter, Ornella, was diagnosed with Sanfilippo Syndrome. At the time of diagnosis the doctors stated that the prognosis for Ornella was highly dysfunctional childhood development and an early death at the age of 20 (if not earlier).

“There will be no cure for 20 years, go home and enjoy the time left with your child.” — The doctors diagnosing Ornella

In spite of the doctors’ prognosis, Karen was determined to do something about her daughter’s health condition. She did research and talked to the right people and eventually brought a treatment for Sanfilippo Syndrome to clinical trials. Her daughter still lives, and has given hope to other patients with Sanfilippo Syndrome.

Unfortunately, Karen and Ornella’s story is still rare when it comes to the rare disease community. There are thousands who die annually without any cure, or even any hope for a cure. This must change. As research progresses and discoveries about rare diseases are made, more cures will come. Karen is one of the pioneers of precision medicine. I hope for a time when all patients diagnosed with a rare disease will be as fortunate as Ornella.

You can read more about Karen and Ornella’s story here at labiotech.eu.

-Cole


# Honoring Pi Day

Preface: This is a throwback from some years ago (circa 2012) when I was a Senior in high school, and was in honor of Pi Day. Enjoy!

π. It is romantic, infinite, enigmatic, and yet so simple. It can be discovered by taking the area of a circle and dividing that by the square of its radius. It has fooled many intellectual men and women throughout the ages, and still does today. I will do a little manipulation of π (and hopefully not be committing mathematical heresy) with the advent of π day.

As many know, π begins 3.14159265…, which would mean that the most accurate π day would have been on March 14, 1592. So, going off this, if π changed every year according to the date, what mathematical effects would this bring?

To illustrate this I will use one of the most basic uses of π, finding the area of a circle. For ease of understanding, our circle will have a radius of 1. Reach back to third grade and remember the formula for the area of a circle, $$A = \pi r^2$$. This means that if our radius is $$r = 1$$, then the area will be π, or 3.141592.

Now let’s have some fun.

If we modify π to change every year, π would equal 3.142012 this year, next year it would equal 3.142013, and so on for eternity. The repercussion of this would be that circles with a radius of 1 would be 0.00042 units² larger than on the ‘true’ π day back in 1592. Even though this is a mere 0.013369018% increase in size and a 0.000001 units² increase per year, it could still be a pretty big deal (ok, not really). If this pattern continued, it would take about 3.14 million years for the ‘area’ of a circle to double (go from about 3.14 units² to about 6.28 units²). By then, every circle with a radius of 1 that you encounter would conceptually be twice the size… Crazy to think about.

But thankfully π is constant, or is it since it is infinite?

Happy π day.

-Cole
