Menggy Technology


  • Have you had problems extracting, cleansing, organizing, or formatting a dataset?
  • Have you had difficulty computing variables on a very large dataset?
  • Have you had trouble extracting specific content from a large number of PDF files or dealing with a messy Excel file?
  • For academic researchers: have you ever struggled to implement highly complicated programs for your psychology studies or marketing surveys with Qualtrics, only to find in the end that they don't work?
Save TIME, Focus on RESEARCH. We are MORE PROFESSIONAL.

Web Crawling and Scraping


Collected 25,100,346 web pages. Extracted 78,469,667 records. Total size: 19.79 GB.
www.jobstreet.com.sg
carousell.com
www.facebook.com
www.glassdoor.com
fanpagelist.com
www.lenovomm.com
store.steampowered.com
www.kimiss.com
www.xueqiu.com
www.xinshipu.com
www.tripadvisor.com
www.topuniversities.com
www.szse.cn
www.sse.com.cn
www.propertyguru.com.sg
www.playdota.com
www.nuomi.com
www.linkedin.com
www.laoyaoba.com
www.iyp.com.tw
www.itjuzi.com
www.iteye.com
www.imdb.com
www.hclips.com
www.google.com
www.github.com
www.ebay.com
www.douban.com
www.digikey.com
www.data.gov.sg
www.cuaa.net
www.crowdspring.com
www.ccug.net
www.bloomberg.com
www.51baomu.cn
dzh.mop.com
beijing.anjuke.com
bbs.taobao.com
bbs.sgcn.com
bbs.hupu.com
Output can be encoded and formatted in various ways.

UTF-8

GB2312

TXT

CSV

EXCEL

SQL

JSON
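
As a rough illustration of what choosing an encoding and a format means in practice, the sketch below writes the same records as CSV and JSON in both UTF-8 and GB2312. It uses only the Python standard library; the sample records and the helper names `to_csv_bytes` and `to_json_bytes` are hypothetical and are not taken from any real project.

```python
import csv
import io
import json

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"title": "数据分析师", "city": "Singapore"},
    {"title": "Research Assistant", "city": "北京"},
]

def to_csv_bytes(rows, encoding):
    """Serialize rows as CSV text, then encode with the requested codec."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)

def to_json_bytes(rows, encoding):
    """Serialize rows as JSON, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False).encode(encoding)

utf8_csv = to_csv_bytes(records, "utf-8")
gb_csv = to_csv_bytes(records, "gb2312")
utf8_json = to_json_bytes(records, "utf-8")
```

The content is identical in both encodings; only the bytes on disk differ, so the encoding is usually chosen to match whatever software will open the file next.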

Data Processing


We can process MOST commonly seen data sources.

PDF
EXCEL
HTML
JSON
TXT
CSV
SQL

We can process millions of files or a huge dataset.

  • Extraction: Efficiently and accurately extract useful content from text files or a dataset.
  • Cleansing: Filter out dirty data points.
  • Organization: Remove redundancies and reduce dependencies based on the entity-relationship (ER) paradigm.
  • Formatting: Standardize to any format, so that the data can be analysed directly.
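
The four steps above can be sketched on a toy example. This is a minimal illustration under stated assumptions, not a description of any production pipeline: the "name, email, school" line layout, the sample rows, the simple email pattern, and all function names are hypothetical.

```python
import re

# Hypothetical raw text lines; the "name, email, school" layout is assumed.
raw_lines = [
    "Alice Tan, alice@example.com, School A",
    "Bob Lee, , School B",                      # missing email
    "alice tan, ALICE@EXAMPLE.COM, School A",   # duplicate of the first row
    "Carol Ng, carol@example.com, School B",
]

# Deliberately loose email check, good enough for the illustration.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def extract(lines):
    """Extraction: split each line into named fields."""
    for line in lines:
        name, email, school = [part.strip() for part in line.split(",")]
        yield {"name": name, "email": email, "school": school}

def cleanse(rows):
    """Cleansing: drop rows whose email is missing or malformed."""
    return [row for row in rows if EMAIL.match(row["email"])]

def organize(rows):
    """Organization: deduplicate people and move schools to their own table."""
    schools, people = {}, {}
    for row in rows:
        school_id = schools.setdefault(row["school"], len(schools) + 1)
        key = row["email"].lower()
        people.setdefault(
            key,
            {"name": row["name"].title(), "email": key, "school_id": school_id},
        )
    return list(people.values()), schools

def fmt(people):
    """Formatting: emit a header plus one comma-separated line per person."""
    header = "name,email,school_id"
    body = [f"{p['name']},{p['email']},{p['school_id']}" for p in people]
    return "\n".join([header] + body)

people, schools = organize(cleanse(extract(raw_lines)))
table = fmt(people)
```

Storing each school once and referring to it by `school_id` is the kind of redundancy removal the ER paradigm aims for: renaming a school then touches one row instead of every person attached to it.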

Data Computing


  • Basic Statistics: Sum, Mean (Average), Median, Standard Deviation, Variance.
  • Network: In/out Degree, Size, Closeness Centrality, Reach Centrality, Shortest Path.
  • Datetime: Time Duration, Days of the Week, Number of Days.
  • Customized formulas or algorithms.
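
To make a few of these measures concrete, here is a small sketch using only the Python standard library. The sample values, the edge list, and the timestamps are all made up for illustration, and the population (rather than sample) variance is an arbitrary choice for the example.

```python
import statistics
from collections import deque
from datetime import datetime

# Basic statistics on a hypothetical sample.
values = [2, 4, 4, 4, 5, 5, 7, 9]
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.pvariance(values)   # population variance
std_dev = statistics.pstdev(values)       # population standard deviation

# Network measures on a hypothetical directed edge list.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
out_degree, in_degree, neighbours = {}, {}, {}
for source, target in edges:
    out_degree[source] = out_degree.get(source, 0) + 1
    in_degree[target] = in_degree.get(target, 0) + 1
    neighbours.setdefault(source, []).append(target)

def shortest_path_length(start, goal):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None

# Datetime measures between two hypothetical timestamps.
start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 15, 17, 30)
duration = end - start              # a timedelta
number_of_days = duration.days
weekday_index = start.weekday()     # Monday is 0, Sunday is 6
```

For very large datasets the same calculations would normally run in a database or a dedicated library instead of plain Python lists, but the definitions of the measures stay the same.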

Research & Web Development


  • Natural Language Processing (NLP): Word Counting, EN/CN Word Segmentation, Named Entity Recognition, Sentiment Analysis, Part-of-Speech (POS) Tagging.
  • Regression Analysis: Linear, Logistic, Polynomial, Stepwise, Ridge, Lasso, ElasticNet.
  • Research-level Web Development: Websites for marketing surveys or psychology studies that require very complicated functions, such as Condition Randomization or Time Tracking (in milliseconds), or that need to work with devices measuring Skin Conductance, BPM, or EEG.
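
Two of the study functions named above, condition randomization and millisecond time tracking, can be illustrated with a short sketch. It assumes block randomization as the scheme (one of several possible designs), and the condition names and the helpers `assign_conditions` and `timed_ms` are hypothetical. In a real web study the timing would typically be captured in the participant's browser, not on the server as here.

```python
import random
import time

# Hypothetical condition names for a three-arm study.
CONDITIONS = ["control", "treatment_a", "treatment_b"]

def assign_conditions(participant_ids, conditions, seed=None):
    """Block randomization: each block of len(conditions) participants
    receives every condition exactly once, in a shuffled order."""
    rng = random.Random(seed)
    assignment = {}
    block = []
    for pid in participant_ids:
        if not block:
            block = list(conditions)
            rng.shuffle(block)
        assignment[pid] = block.pop()
    return assignment

def timed_ms(task):
    """Run task() and return (result, elapsed time in milliseconds)."""
    started = time.perf_counter()
    result = task()
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    return result, elapsed_ms

assignment = assign_conditions(range(1, 7), CONDITIONS, seed=42)
result, elapsed_ms = timed_ms(lambda: sum(range(1000)))
```

Block randomization keeps group sizes balanced even if recruitment stops early, which plain independent random assignment does not guarantee.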

Our Clients


Nanyang Business School
NUS Business School