Easter Eggs, Bats, and Bubba

Yes, his name was Bubba. Yes, this story is true.

“You boys get over here!” my stepfather shouted. The boys stopped and turned. My stomach turned too. These four boys always wanted to fight me. I just had to dodge them on the weekends I visited my Mom, and for months this had worked. But today they surprised me as I left Easter services: they cornered me and beat me with a baseball bat. I know how terrible that sounds, but you should know it was a measured violence. [Read More]

Using dot to generate a hormone map

While studying for the MCAT, my friends and I wrote up a hormone map on the whiteboard. I took a snapshot with my phone and translated it to the dot language.

Here’s the dot language spec, if you’re interested.

// Hormone graph
digraph G {
	//rankdir=LR;
	node[color="green"];
	GH; FSH; LH; TSH;Melatonin;ACTH;OX;ADH;ENDOR;PRO;CALCI;PTH;ANP;THYM;INS;SOM;GLU;
	node[shape="plaintext"];
	H;AP;PP;G;T;KIDNEY;AC;BM;PG;AM;BMa;PT;B;HEART;URINE;Thymus;TCELL;PAN;UTERUS;MAMMARY;IMMUNE;
	node[shape="oval",color="black"];
	BMa[label="Bone Marrow"];
	H [label="Hypothalamus"];
	AP [label="Anterior Pituitary"];
	PP [label="Posterior Pituitary"];
	BM [label="Bones\nMuscle"];
	PG [label="Pineal Gland"];
	GH [label="Growth\nhormone (GH)"];
	TSH [label="Thyroid-stimulating\nhormone (TSH)"];
	LH [label="Luteinizing\nhormone (LH)"];
	FSH [label="Follicle-stimulating\nhormone (FSH)"];
	ACTH [label="Adrenocorticotropic\nhormone (ACTH)"];
	ADH [label="Antidiuretic\nhormone (ADH)"];
	G [label="Gonads"];
	T [label="Thyroid"];
	GHRH[label="Growth hormone-releasing\nhormone (GHRH)"];
	TRH[label="Thyrotropin-releasing\nhormone (TRH)"];
	GNRH[label="Gonadotropin-releasing\nhormone (GnRH)"];
	CRF[label="Corticotropin-releasing\nhormone (CRH)"];
	PT[label="Parathyroid"];
	PTH[label="Parathyroid\nhormone (PTH)"];
	B[label="Blood"];
	PAN[label="Pancreas"];
	INS[label="Insulin"];
	SOM[label="Somatostatin"];
	GLU[label="Glucagon"];
	HEART[label="Heart"];
	ANP[label="Atrial Natriuretic\nPeptide (ANP)"];
	URINE[label="Urine"];
	TCELL[label="T-Cell"];
	THYM[label="Thymosin"];
	KIDNEY[label="Kidney"];
	AM[label="Adrenal Medulla"];
	OX[label="Oxytocin"];
	ENDOR[label="Endorphins"];
	PRO[label="Prolactin"];
	AC [label="Adrenal Cortex"];
	CALCI[label="Calcitonin"];
	T4[label="T4 Thyroxine\n(AAd)"];
	UTERUS[label="Uterus"];
	MAMMARY[label="Mammary\nGlands"];
	IMMUNE[label="Immune\nSystem"];
	CORTISONE[label="Cortisone"];
	EPO[label="EPO Erythropoietin"];
	ALDOSTERONE[label="Aldosterone"];

	H -> PP[label="soma\naxon", style="dotted"];
	PP -> {OX ADH};
	OX -> UTERUS[label="Contractions"];
	ADH -> KIDNEY[label="Increased\nH2O\nreabsorption"];

	subgraph cluster_portal {
		label="Portal System/Stalk";
		GHRH; GNRH; TRH; CRF;
	}
	H -> {GHRH GNRH TRH CRF};
	GHRH -> AP[color="red"];
	GNRH -> AP[color="orange"];
	TRH -> AP[color="blue"];
	CRF -> AP[color="brown"];

	// Anterior Pituitary
	AP -> GH[color="red"];
	GH -> BM;
	AP -> LH[color="orange"];
	LH -> G;
	AP -> FSH[color="orange"];
	FSH -> G;
	AP -> TSH[color="blue"];
	TSH -> T;
	AP -> ACTH[color="brown"];
	ACTH -> AC;
	AP -> ENDOR;
	AP -> PRO[color="blue"];
	PRO -> MAMMARY[label="Milk\nproduction"];

	T -> "T3 Triiodothyronine\n(AAd)";
	T ->  T4;
	T -> CALCI;
	CALCI -> B[label="Decreased\nCa2+"];

	PG  -> Melatonin;

	subgraph cluster_glucocorticoids {
		label="Glucocorticoids";
		CORTISONE;Cortisol;
	}
		
	AC -> {CORTISONE Cortisol};
	CORTISONE -> IMMUNE[label="Suppresses"];
	CORTISONE -> B[label="Increased\npressure"];

	subgraph cluster_mineralcorticoids {
		label="Mineralocorticoids";
		ALDOSTERONE;
	}

	AC -> ALDOSTERONE;
	ALDOSTERONE -> KIDNEY[color="red", style="dashed"];
	KIDNEY -> EPO;
	KIDNEY -> URINE[label="Conserve Na+\nSecrete K+\nHold H2O", color="red", style="dashed"];
	EPO -> BMa[label="RBC\nproduction"];

	AM -> "Norepinephrine";
	AM -> "Epinephrine";

	G -> {Estrogen Progesterone};

	PT -> PTH;
	PTH -> B[label="Increased\nCa2+"];

	subgraph cluster_pan {
		style=filled;
		color=white;
		PAN -> INS;
		INS -> B[label="Decreases\nGlucose"];

		PAN -> GLU;
		GLU -> B[label="Increases\nGlucose"];

		PAN -> SOM;
		SOM -> GLU[label="Suppresses"];
		SOM -> INS[label="Suppresses"];
	}

	HEART -> ANP;
	ANP -> KIDNEY[color="blue", style="dashed"];
	ANP -> ALDOSTERONE[label="Inhibits"];
	KIDNEY -> URINE[label="Release Na+\nIncreased volume", color="blue", style="dashed"];

	Thymus -> THYM;
	THYM -> TCELL[label="Development\nDifferentiation"];
}
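
If you would rather keep the edges in a list and generate the dot source, here is a minimal Python sketch. The `to_dot` helper and the tuple layout are my own illustration, not part of Graphviz; the three sample edges are copied from the graph above.

```python
def to_dot(edges, name="G"):
    """Render (source, target, label) triples as a DOT digraph string.

    Hypothetical helper for illustration; it does no quoting or escaping,
    so node names must already be valid DOT identifiers.
    """
    lines = [f"digraph {name} {{"]
    for source, target, label in edges:
        # Only emit an attribute list when the edge has a label.
        attrs = f'[label="{label}"]' if label else ""
        lines.append(f"\t{source} -> {target}{attrs};")
    lines.append("}")
    return "\n".join(lines)


# A few edges taken from the hormone graph; None means an unlabeled edge.
# "\\n" keeps a literal backslash-n in the output, which dot treats as a line break.
edges = [
    ("PT", "PTH", None),
    ("PTH", "B", "Increased\\nCa2+"),
    ("HEART", "ANP", None),
]

dot_source = to_dot(edges)
print(dot_source)
```

Assuming Graphviz is installed, saving the output as hormones.dot and running `dot -Tpng hormones.dot -o hormones.png` renders the image.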

Your DNA holds over 60 zettabytes of data

Your DNA holds over 60 zettabytes of data. That’s about 5,000 times the estimated information content of all human knowledge. There are four nucleobases in DNA, adenine [A], cytosine [C], thymine [T] and guanine [G], which require 2 bits each to store. Each haploid cell (sperm or egg) in your body is made of 3,234.83 million base pairs. Your somatic cells have twice as many base pairs, with one set coming from your dad and the other coming from dear old mom. There are an estimated 37. [Read More]
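
The arithmetic behind that headline can be sketched in a few lines of Python. The base-pair count and the 2 bits per base come from the excerpt above; the decimal zettabyte (10^21 bytes) is my assumption. Rather than plug in a cell count, the sketch works backwards from the 60 ZB figure to the number of cells it implies.

```python
BITS_PER_BASE = 2                  # four nucleobases, so log2(4) = 2 bits each
HAPLOID_BASE_PAIRS = 3_234.83e6    # figure quoted in the excerpt
BYTES_PER_ZETTABYTE = 1e21         # assumption: decimal (SI) zettabyte
HEADLINE_ZETTABYTES = 60           # the "over 60 zettabytes" claim

# One haploid genome, then a somatic cell carrying two copies.
haploid_bytes = HAPLOID_BASE_PAIRS * BITS_PER_BASE / 8
somatic_bytes = 2 * haploid_bytes

# How many somatic cells would be needed to reach the headline figure.
implied_cells = HEADLINE_ZETTABYTES * BYTES_PER_ZETTABYTE / somatic_bytes

print(f"haploid genome: {haploid_bytes / 1e6:.1f} MB")
print(f"somatic cell:   {somatic_bytes / 1e9:.2f} GB")
print(f"implied cells:  {implied_cells / 1e12:.1f} trillion")
```

This treats every cell as a diploid somatic cell, which is a simplification.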

Introduction to Base Quality Score Recalibration (BQSR)

Thanks to Chris Hartl for writing the initial implementation of BQSR for ADAM and for taking the time to share his knowledge of BQSR with me over cappuccino at People’s Cafe. Hopefully this post will help others who are trying to understand how BQSR works. Drop a comment if you have any questions. DNA sequencing machines provide an estimate of the quality of each base (e.g. A, C, T or G) that they read. [Read More]
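
For readers new to base quality scores, here is a minimal Python sketch of the standard Phred relationship, Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. The function names are my own, and the offset of 33 assumes the common Phred+33 ASCII encoding; none of this is taken from the ADAM implementation.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability the base is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability back to a Phred quality score."""
    return -10 * math.log10(p)


def decode_quality_string(qualities, offset=33):
    """Decode an ASCII quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in qualities]


for q in (10, 20, 30, 40):
    print(f"Q{q}: P(error) = {phred_to_error_probability(q):g}")
```

BQSR adjusts these reported scores so that they better match the error rates actually observed in the data.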

A powerful Big Data trio: Spark, Parquet and Avro

Note: A cleaner, more efficient way to handle Avro objects in Spark can be seen in this gist. I love open-source projects that play nicely with others; no one likes to be locked into a single data processing framework or programming language. Mature open-source projects build software with integration and openness in mind to allow engineers to attack Big Data problems from a number of different angles using the most appropriate tool for the job. [Read More]

Playing with matches and CIGARs

Aligned reads in a SAM or BAM file typically have a Compact Idiosyncratic Gapped Alignment Report (CIGAR) string that expresses how the read is mapped to the reference genome.

Table of Cigar Operators

When I first read the CIGAR operator table (above), I was confused by two things:

  1. the match, M, operator description, “alignment match (can be a sequence match or mismatch)”, struck me as odd.
  2. the relationship between the M, = and X operators isn’t explained in the spec.

I hope this blog post helps others with the same questions.

[Read More]
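
As a rough illustration of how M relates to = and X, here is a small Python sketch. The helper names are hypothetical, and the operator sets follow the "consumes query" and "consumes reference" columns of the SAM spec's operator table. The idea is that = and X are a finer-grained split of M, so rewriting them as M changes neither the read length nor the reference span.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_QUERY = set("MIS=X")       # operators that advance along the read
CONSUMES_REFERENCE = set("MDN=X")   # operators that advance along the reference


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    if not re.fullmatch(r"(?:\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def lengths(cigar):
    """Return (read bases consumed, reference bases consumed)."""
    ops = parse_cigar(cigar)
    query = sum(n for n, op in ops if op in CONSUMES_QUERY)
    reference = sum(n for n, op in ops if op in CONSUMES_REFERENCE)
    return query, reference


def collapse_to_m(cigar):
    """Rewrite = and X as M, merging adjacent runs of the same operator."""
    merged = []
    for n, op in parse_cigar(cigar):
        op = "M" if op in "=X" else op
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + n, op)
        else:
            merged.append((n, op))
    return "".join(f"{n}{op}" for n, op in merged)


print(lengths("10M"), lengths("4=1X5="))   # both describe the same spans
print(collapse_to_m("4=1X5="))
print(lengths("3S5M2I4M1D6M"))
```

Going the other way, from M to = and X, needs the read and reference sequences, because M alone does not say which positions mismatch.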

Chabot 50K Trail Run Race Report

I ran my first 50K today – the Chabot 50K Trail Run. The volunteers at Inside Trail Racing impressed me with their professionalism, friendliness and genuine concern for my well-being. Inside Trail Racing put on one of the best trail runs I’ve been a part of. As an example, a volunteer at the Two Rocks aid station (~mile 23) gave me one of her personal water bottles when she saw I wasn’t carrying one (since I forgot it at home). [Read More]

Late 2009 iMac HDD Replacement

The hard drive in my iMac (Late 2009 27”) died last weekend and I decided to replace it myself. Here are some quick tips if you find yourself in the same situation. Not sure if disk errors are your problem? Boot your Mac and press “Command-V” during startup for verbose boot output. You’ll see messages about “Disk I/O Error” during boot.

There’s a great tutorial on iFixit that explains step-by-step how to replace the drive. I found my iMac had a 3.5” Hitachi Model HDE721010SLA330 SATA 3.0 Gb/s drive once I cracked it open. You can replace the drive with any 3.5” SATA drive you like. I chose to replace it with a comparable Western Digital drive that had more cache.

[Read More]