Blog
66 posts on machine learning, data science, data engineering, and software development.
Aug 23, 2025 Claude Code + MCP: AI Development with Bedrock
Sep 24, 2017 Provision AWS EC2 cluster with Spark version 2.x
Mar 12, 2017 Streaming with Apache Storm
Mar 1, 2017 Data ingestion and loading: Flume, Sqoop, Hive, and HBase
Feb 26, 2017 Streaming processing (III): Best Spark Practice
Feb 25, 2017 Streaming processing (II): Best Kafka Practice
Jan 6, 2017 Streaming processing (I): Kafka, Spark, Avro Integration
Dec 25, 2016 Deep Sentiment Prediction as Web Service
Jun 19, 2016 Track my sports
Mar 17, 2016 Deploy ELK stack on Amazon AWS
Feb 17, 2016 Build a simple web application with Amazon AWS
Feb 1, 2016 Spark on time series preference data
Jan 31, 2016 GPU computation on Amazon EC2
Jan 21, 2016 2015年NIPS会议中酷炫的东西 - Neural Style [Cool stuff at the NIPS 2015 conference - Neural Style]
Jan 5, 2016 Cool stuff in NIPS 2015 (symposium) - Neural Style
Jan 1, 2016 A super fabulous beginning of a super great year 2016
Dec 31, 2015 Data science in the next 50 years - are machine learning and statistics complementary?
Dec 26, 2015 Cool stuff in NIPS 2015 (workshop) - Non-convex optimization in machine learning
Dec 25, 2015 Cool stuff in NIPS 2015 (workshop) - Time series
Dec 21, 2015 A rich and dynamic December
Dec 20, 2015 My research on machine learning and AI
Dec 16, 2015 NIPS conference 2015
Dec 15, 2015 Me
Nov 19, 2015 Build web applications with Flask+Heroku
Nov 11, 2015 Calendar view of data in Jekyll with D3.js
Nov 9, 2015 Xplanner in Junction Hackathon 2015
Nov 2, 2015 Documentation and test modules for Python
Oct 29, 2015 Teaser solution
Oct 20, 2015 Pabulo, my lovely cat
Oct 19, 2015 Chinese national day celebration in China embassy Helsinki
Oct 19, 2015 Spark regression models
Oct 18, 2015 Spark classification models
Oct 13, 2015 Spark with Python: collaborative filtering
Oct 12, 2015 Feature extraction, selection and predictive modeling with Scikit
Oct 10, 2015 Novelty detection and outlier detection with Scikit
Aug 28, 2015 One class classification with Scikit
Aug 25, 2015 Predicting transporter proteins
Aug 24, 2015 Searching Algorithm
Aug 20, 2015 BFS and DFS
Aug 18, 2015 SQL related
Aug 16, 2015 Compute TF-IDF with Hadoop Python
Aug 15, 2015 MapReduce with Hadoop via Python with Examples
Aug 13, 2015 Scikit: A machine learning package for Python
Aug 12, 2015 Get Emoji support for Jekyll pages
Aug 12, 2015 Outstanding doctoral candidate award of 2014
Aug 3, 2015 Heap
Jul 30, 2015 Stack and Queue
Jul 29, 2015 Dynamic programming related problems
Jul 29, 2015 Recursion
Jul 27, 2015 Setup Hadoop on macOS
Jul 26, 2015 Spark via Python: basic setup, count lines, and word counts
Jul 22, 2015 Palindrome problems
Jul 19, 2015 SQL refreshment
Jul 17, 2015 Sorting algorithms
Jul 12, 2015 Bit integer for operating large numbers
Jun 17, 2015 Feature extraction for protein sequences via InterProScan
Jun 16, 2015 Sequence alignment with NCBI-BLAST search
Jun 10, 2015 Tiny little bit of Python Pandas
Jun 9, 2015 Facebook challenge of detecting robots
May 22, 2015 A projected Newton method for optimizing structured output model
May 21, 2015 Spark with Python: optimization algorithms
May 17, 2015 Spark with Python: linear models in MLlib
May 15, 2015 Some useful Coding techniques
May 12, 2015 Spark with Python: configuration and a simple Python script
May 11, 2015 The quickest way to blog, GitHub + Jekyll
Dec 29, 2011 Untitled