Thinking about all of the things in the statistical world that we can estimate, the one that has always perplexed me is estimating the size of an unknown population (N). Usually when we compute e...
http://www.bytemining.com/2018/03/estimating-population-size-animals-and-web-pages/
The first few hundred registrations received a mug. As a machine learning practitioner in the Los Angeles area, I was ecstatic to learn that NIPS 2017 would be in Long Beach this year. The confer...
http://www.bytemining.com/2017/12/highlights-from-my-first-nips/
NOTE 1: This is part 1 in a series that will probably contain 3 or 4 parts. Then I will return to the usual data science etc. posts. NOTE 2: This post was intentionally delayed until I received f...
http://www.bytemining.com/2017/04/ph-d-defense-post-mortem-and-advice-for-others/
I’ve never been very big on New Year’s resolutions. I’ve tried them in the past, and while they are nice to think about, they are always overly vague, difficult to accomplish in a year, tri...
http://www.bytemining.com/2017/01/some-new-year-resolutions-for-this-data-scientist-in-2017/
These past three years have really flown. It’s time for me to finally get back to my roots and also start blogging more, like I did previously. My last post was about Strata 2013. During this tim...
In this post I am going to summarize some of the things that I learned at Strata Santa Clara 2013. For now, I will only discuss the conference sessions as I have a much longer post about the tut...
http://www.bytemining.com/2013/02/summary-of-my-first-trip-to-strata-strataconf/
Wishing you all a very Merry Christmas, Happy Holidays and Happy New Year! An update on me. In October, I began working at Riot Games, the developers of League of Legends. It has been an amazing ...
http://www.bytemining.com/2012/12/merry-christmas-and-happy-holidays/
Last week I received two Raspberry Pis in the mail from AdaFruit and just now have some time to play with them. The Raspberry Pi is a minimal computer system that is about the size of a credit ca...
http://www.bytemining.com/2012/10/a-new-data-toy-unboxing-the-raspberry-pi/
During the past few decades that I have been in graduate school (no, not literally) I have boycotted JSM on the notion that “I am not a statistician.” Ok, I am a renegade statistician, a stat...
http://www.bytemining.com/2012/08/adventures-at-my-first-jsm-joint-statistical-meetings-jsm2012/
OpenPaths is a service that allows users with mobile phones to transmit and store their location. It is an initiative by the New York Times that allows users to use their own data, or to contribu...
http://www.bytemining.com/2012/07/openpaths-and-a-progressive-approach-to-privacy/
Note: This would have been up a lot sooner but I have been dealing with a bug on and off for pretty much the past month! From April 26-28 I had the pleasure to attend the SIAM Data Mining conf...
http://www.bytemining.com/2012/05/siam-data-mining-2012-conference/
Recently, I participated in an email interview about what being a Statistics major entailed, how I got interested in the field and the future of Statistics. I figured this might be of interest to...
http://www.bytemining.com/2012/03/my-interview-about-the-statistics-major/
Whenever I tell people in my family that I study Statistics, one of the first questions I get from laypeople is “do you count cards?” A blank look comes over their face when I say “no.” ...
http://www.bytemining.com/2012/01/hold-only-that-pair-of-2s-studying-a-video-poker-hand-with-r/
To all of my readers and followers, I wish you a very Merry Christmas and a very joyous and safe Happy New Year! This year, I am thankful for the community that has sprung up around Data Science ...
http://www.bytemining.com/2011/12/merry-christmas-2011-from-byte-mining/
Lately I have been doing a lot of work with the Wikipedia XML dump as a corpus. Wikipedia provides a wealth of information to researchers in easy-to-access formats including XML, SQL and HTML dumps for a...
http://www.bytemining.com/2011/11/parsing-wikipedia-articles-wikipedia-extractor-and-cloud9/
My thoughts on data mining, machine learning, programming languages, open-source software and general nerdery.
http://www.bytemining.com/2011/09/lexisnexis-open-sources-its-hadoop-alternative/
<< My review of Day 1. I am summarizing all of the days together since each talk was short, and I was too exhausted to write a post after each day. Due to the broken-up schedule of the KDD sess...
http://www.bytemining.com/2011/08/sigkdd-2011-conference-days-234-summary-3/
I have been waiting for the KDD conference to come to California, and I was ecstatic to see it held in San Diego this year. AdMeld did an awesome job displaying KDD ads on the sites that I visit,...
It’s been a while since I have posted… in the midst of trying to plow through this dissertation while working on papers for submission to some conferences. Hadoop has become the de facto st...
http://www.bytemining.com/2011/08/hadoop-fatigue-alternatives-to-hadoop/
I woke up early and cheery Wednesday morning to attend the 2011 Hadoop Summit in Santa Clara, after a long drive from Los Angeles and the Big Data Camp that lasted until 10pm the night before. Ha...
http://www.bytemining.com/2011/06/my-review-of-hadoop-summit-2011-hadoopsummit/
It has been a while since I have been to Silicon Valley, but Hadoop Summit gave me the opportunity to go. To make the most of the long trip, I also decided to check out BigDataCamp held the night...
http://www.bytemining.com/2011/06/big-data-camp-2011-bigdatacamp/
Recently, I have been thinking about alternate ways of specifying search queries other than with text. A couple of weeks ago I came across a piece of music that I could not identify. I thought it...
http://www.bytemining.com/2011/06/google-is-search-by-multimedia-on-the-way/
I am usually pretty reserved with cash, but after working full-time for six months, I finally decided to spend some of my money on building a new research development server. This process was lon...
http://www.bytemining.com/2011/05/want-to-build-a-research-server-6/
Some time over the past 6 weeks I randomly saw a tweet announcing the “Data Scientist Summit” and shortly below it I saw that it would be held in Las Vegas at the Venetian. Being a Data Scien...
http://www.bytemining.com/2011/05/review-of-2011-data-scientist-summit/
Elastic Compute Cloud (EC2) is a service provided by Amazon Web Services that allows users to leverage computing power without the need to build and maintain servers, or spend money on special har...
http://www.bytemining.com/2011/05/ec2-trials-and-tribulations-part-1-web-crawling/