Azfar I. – Data Scientist

Azfar I.

My work for Geescore™ focuses on Natural Language Processing and Machine Learning for its parsing and scoring services.

Geescore™ scores a Jobseeker for any job posting anywhere, in about 3 seconds. The score is dynamic: it can change as the Jobseeker interacts with the service (R Tool), adds data (documents), shares social links, and helps fix issues that are discovered.

One of the core aspects of such a tool is the ability to parse both resumes and job postings, and this is one of the key areas that I deal with.

Parsing resumes

Geescore™ accepts txt, doc, docx and PDF formats. Resume sections are identified and separated using a combination of Machine Learning and advanced parsing rules. Once a resume has been parsed, it is stored in the database.
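
To give a feel for the rule-based half of that step, here is a minimal sketch of how section headers might be detected and used to split a plain-text resume. The header vocabulary and the function name `split_resume_sections` are assumptions made for illustration; the actual rules and the Machine Learning component used in Geescore™ are not described here.

```python
import re

# Hypothetical header vocabulary mapping surface forms to canonical section names.
SECTION_HEADERS = {
    "summary": "summary",
    "objective": "summary",
    "experience": "experience",
    "work experience": "experience",
    "employment history": "experience",
    "education": "education",
    "skills": "skills",
    "technical skills": "skills",
}


def split_resume_sections(text):
    """Group resume lines under the most recent recognised section header."""
    sections = {"preamble": []}
    current = "preamble"
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Normalise a candidate header: lowercase, drop trailing colons/spaces.
        key = re.sub(r"[:\s]+$", "", line.lower())
        if key in SECTION_HEADERS:
            current = SECTION_HEADERS[key]
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    # Drop sections that ended up empty.
    return {name: "\n".join(lines) for name, lines in sections.items() if lines}
```

Lines that appear before any recognised header (typically the name and contact details) land in a `preamble` section, so nothing is discarded.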

Parsing job postings

Parsing job postings is a similar but more complicated task. A web scraping engine first extracts the text and HTML from the URL of the job posting. The content of a job posting can typically be segmented into sections such as job description, company description, responsibilities, qualifications and contact information.
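
As a rough illustration of the text-extraction step, the sketch below pulls the visible text out of an HTML string using only Python's standard library. It assumes the page has already been downloaded; the class and function names are hypothetical, and the actual scraping engine may work quite differently.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML document, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Keeping one text node per line preserves a hint of the page layout, which is useful later when deciding which lines look like section headers.
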
The Machine Learning solution we developed for parsing job descriptions uses a custom Named Entity Recognition classifier. This model has been trained on tagged job descriptions to identify section headers or, in some cases, section values.
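
Once headers have been tagged, the posting can be segmented by attaching each line to the most recent header. The sketch below shows only this post-processing step and assumes the classifier's output has been reduced to one optional label per line; the label names, the `unlabelled` bucket and the function name are hypothetical, and the model itself is not shown.

```python
def group_by_section(labelled_lines):
    """Group lines of a job posting under the most recently tagged header.

    labelled_lines: iterable of (text, label) pairs, where label is a section
    name if the line was tagged as a header, or None for ordinary content.
    """
    sections = {}
    current = "unlabelled"  # bucket for content seen before any header
    for text, label in labelled_lines:
        if label is not None:
            # A header line starts a new section; its own text is not kept.
            current = label
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(text)
    return sections
```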

Scoring resumes

For scoring resumes, we have a sophisticated algorithm in place that considers multiple real-world recruiting factors, including matching science, commuting distance, interest in the job, work longevity, and custom scoring for specific end-Client business domains, e.g. fast food restaurants.
The matching science score factors in generic and domain-specific keywords, keyphrases and acronyms from both the Jobseeker's resume and the job posting.
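
The general idea of combining a keyword match with other factors can be sketched as follows. This is a deliberately simplified illustration, not the Geescore™ algorithm: the bag-of-words matching, the factor names and the weights are all assumptions.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def keyword_match_score(resume_text, job_keywords):
    """Fraction of job keywords found in the resume, from 0.0 to 1.0.

    A keyphrase counts as matched when all of its words occur somewhere in
    the resume (bag-of-words matching, not exact phrase matching).
    """
    if not job_keywords:
        return 0.0
    resume_tokens = tokenize(resume_text)
    matched = [kw for kw in job_keywords if tokenize(kw) <= resume_tokens]
    return len(matched) / len(job_keywords)


def combined_score(factor_scores, weights):
    """Weighted average of per-factor scores, each expected in [0, 1]."""
    total_weight = sum(weights[name] for name in factor_scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(factor_scores[name] * weights[name] for name in factor_scores)
    return weighted / total_weight


# Hypothetical usage: two factors with made-up weights.
resume = "Shift lead with POS experience and food safety training"
keywords = ["POS", "food safety", "drive-thru", "cash handling"]
matching = keyword_match_score(resume, keywords)
overall = combined_score(
    {"matching": matching, "commute": 1.0},
    {"matching": 3, "commute": 1},
)
```

Keeping each factor on the same 0 to 1 scale makes it straightforward to adjust the weights for a particular business domain without touching the individual scorers.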

Tools and Technologies

All of this has been developed in Python 3. Some of the modules being used are:
scikit-learn
spaCy
rake-nltk
pypostal
fuzzywuzzy