Jun-Wei Lin, PhD Candidate, Software Engineering, UCI

I am a PhD candidate working with Prof. Sam Malek in the School of Information and Computer Sciences, UC Irvine. My research interests are in software testing, especially test suite reduction and web app testing using machine learning and natural language processing techniques. I am the co-founder of Pycone, a startup offering online Python courses. I also teach the online course "Web Crawling with Python", which has 630+ students. My side projects on GitHub have 200+ stars and 100+ forks.

Email: junwel1 at uci dot edu


  • Programming Languages: Python (3+ years of experience), Java, C++, JavaScript, SQL, Bash (basic familiarity).
  • Web Development: Django, jQuery, Bootstrap.
  • Continuous Integration: Jenkins, Robot Framework, Selenium, PhantomJS, JMeter.
  • Machine Learning and Natural Language Processing: scikit-learn, gensim, NLTK, jieba.

Publications (110+ citations) (Google Scholar / ResearchGate)

Projects (GitHub)


Predicting the Best Answers for Questions on Stack Overflow

Term project for CS295 Statistical NLP, Winter 2018. Applied various ML models (e.g., Random Forest and XGBoost) and NLP techniques (e.g., Latent Semantic Indexing) to predict the best answers for questions labeled "Python" on Stack Overflow. Outperformed the baseline by 8.5%. (report)

Kaggle Competition: Rainfall Prediction (7/126, top 6%)

Term project for CS273A Machine Learning, Fall 2017. Used ensembles (e.g., Random Forest and XGBoost) and feature engineering (e.g., missing data handling) to predict rainfall from infrared information.

AlphaTrip (Hackathon project)

A prototype using natural language processing and machine learning to categorize attractions near Tokyo and schedule trips. Collaboratively developed and demoed at the Big Data X Maker Hackathon, Taipei. (presentation, press)

PTT Web Crawler (150+ stars and 80+ forks on GitHub)

Crawler and data parser for the web version of PTT, the largest local online community in Taiwan.

Bulletin Board for Government Job Opportunities (700+ daily users)

Data visualization for open data provided by Taiwan's government.

Research Projects


b2g-monkey (Python)

Fully automatic interface crawler and invariant checker for Firefox OS apps.
Aug. 2015-Dec. 2015, "Automatic Test Case Generation for HTML5 Apps", Institute for Information Industry, Taiwan.

django-blast (Live site) (Python, Django)

An interactive visualization tool that translates the text output of sequence similarity searches into sortable and searchable tables and graphs (collaboratively developed).
May 2014-May 2015, "i5k Workspace@NAL", National Agricultural Library, USDA. (report, paper)

ITRI Cloud Testing Service (Java)

Applied data and behavior mutation to test cases, leveraging computing power provided by ITRI.
Nov. 2013-Feb. 2014, "ITRI Cloud Testing Service", Industrial Technology Research Institute, Taiwan. (report)