Smart Resume Analyser: A Case Study using RNN-based Keyword Extraction

Securing a reputable position in a company is a goal shared by many people, and the resume is key to achieving it. Writing a resume is challenging for many, and revising it is time-consuming and burdensome. To address these problems and help produce eye-catching resumes, we have developed a smart resume analyser that analyses the user's resume and identifies their skills and qualifications, enabling us to suggest the best-suited job titles for their profile. Based on this analysis, our system generates recommendations for optimizing and enhancing the user's resume, making it more appealing to potential employers. Using NLP and ML technologies, the resume analyser provides personalised and effective solutions to job seekers, helping them stand out in today's competitive job market. NLP makes it possible to parse the resume and extract the desired information efficiently. With the help of Python and packages such as streamlit and pymysql, we store the extracted data and give a rating based on an analysis of the stored data. After the rating is generated, we recommend modifications such as adding a hobbies section, a career goal, objectives, and a declaration. We also suggest a few sources through which the user can enhance their skills.


Introduction
A smart resume screening system employs Natural Language Processing (NLP) to extract relevant information from unstructured resumes. The system identifies the candidate's skills and qualifications, enabling us to suggest the best-suited job titles for their profile. Based on this analysis, our system generates recommendations for optimizing and enhancing the user's resume, making it more appealing to potential employers. Using NLP and ML technologies, our resume analyser provides personalised and effective solutions to job seekers, helping them stand out in today's competitive job market. With the help of Python and its packages, we store the extracted data and give a rating based on an analysis of the stored data. After the rating is generated, we recommend modifications such as adding a hobbies section, a career goal, objectives, and a declaration.
The traditional process of sifting through resumes can be time-consuming and inefficient, with employers often receiving hundreds or even thousands of resumes for a single job opening. NLP is about enabling computers to understand human language, which is a challenging task. Computers are good at processing structured data such as spreadsheets and databases, but they struggle with unstructured data such as text and speech. NLP helps computers process and understand this kind of data, which appears in many forms. The proposed interface also benefits job seekers, who receive personalised feedback on their resumes and recommendations for improving their chances of being hired. With the Smart Resume Analyser, job seekers can check that their resumes are optimised for the positions they are applying for, increasing their chances of landing their dream job. The Smart Resume Analyser thus has the potential to make the job application process more efficient and effective for job seekers.

Literature survey
Satyaki Sanyal and team [1] present resume analysis software that automates the extraction and analysis of relevant information from uploaded resumes. The software employs NLP, ML, and data mining techniques to identify keywords, patterns, and trends. It evaluates resumes against job requirements, filtering and ranking them accordingly. This approach simplifies recruitment, reduces recruiter workload, and identifies the most suitable candidates for a job. Vaidya and team [2] describe a method used by resume analysers to extract relevant information. The process involves eliminating stop words and applying the Soundex algorithm to group similar-sounding words. Keywords are then identified and stored in a database.
Daryania and team [3] present an automated resume screening system that uses NLP to analyse resumes and rank them by similarity to the job requirements. The system identifies keywords, skills, and qualifications and compares them to the job description to generate a relevance score. Resumes can then be ranked accordingly, saving time for recruiters. Nawander and team [4] present an approach that uses NLP and Streamlit modules to extract data from PDF resumes, store it in a database, and analyse it for ranking. The system suggests improvements to the presentation of the resume, including formatting, layout, and language.
Pokhare [5] presents an approach that uses NLP and machine learning to parse, extract, and summarize data from PDF resumes. The system identifies sections such as contact details, education, and work experience. NLP and machine learning techniques extract keywords, skills, and qualifications, and analyse the language used to describe the candidate. The data is stored in a database for comparison and analysis. Pravesh and team [6] present an approach that uses machine learning and the K-Nearest Neighbors (KNN) algorithm to match job requirements with candidate resumes.
Kelkar and team [7] propose a Company Recommender System for selecting the best-fit candidate. The system uses text mining and machine learning to rank candidate resumes against company requirements. Relevant information is extracted from resumes using text mining techniques and compared to the job requirements, resulting in a score. A machine learning model is trained on historical data to identify patterns between requirements and candidate qualifications. This model ranks resumes, and a recommendation engine provides recruiters with a list of recommended candidates. Sinha and team [8] propose a method for resume screening using Natural Language Processing (NLP) and Machine Learning (ML) algorithms. The aim is to automate the time-consuming process of CV screening by training the machine to analyse and extract relevant information from unstructured written language. Shubham Bhor and team [9] propose a solution for resume parsing using NLP techniques. The aim is to simplify the process of finding suitable candidates for job openings. Resumes are uploaded to a job portal and parsed to extract important details.
Sanjana and team [10] present resume validation and filtration using built-in NLP techniques, together with job filtering. Recruiting has become a laborious process, with an overwhelming number of resumes submitted for job openings, and it is impractical for recruiters to review each one manually. Gupta and team [11] present an automated resume screening system that uses NLP and machine learning techniques to analyse and rank resumes. The system uses NLP techniques such as part-of-speech tagging and named entity recognition, along with machine learning algorithms. Thakur and Goyal [12] present an ML-based Resume Classification System (RCS) that uses NLP and ML techniques to automate this tedious process. Automating the process can significantly expedite applicant screening and make it more transparent, with minimal human involvement. This model focuses only on reducing the recruiter's work and does not recommend any courses that could add weight to the applicant's resume. The paper suggests that combining NLP, RCS, and ML techniques can automate this work and give the desired results.
Authors [12] highlighted the significance of ML in prediction, pattern recognition, and error reduction across diverse fields, emphasizing the impact of AI across a broad range of domains. Author [13] presented text classification algorithms for various applications and explored the use of machine learning in detecting phishing attacks. Authors [14] discussed the use of machine learning and neural networks, especially CNNs, for recognizing handwriting patterns, with a focus on Telugu film industry names, achieving high accuracy (98.3%). The authors of [15] applied sentiment analysis to reviews and comments, categorizing them as negative, neutral, or positive. That work also explores sentiment analysis architecture and tools for user-friendly opinion mining. The approach in [16] used advanced deep learning with a global threshold to improve e-commerce product classification, achieving high accuracy.

Drawbacks of existing approaches
The existing approaches apply ML-based techniques and related algorithms to screen or parse structured or unstructured resume data and then accept or reject applications. They only match resumes against the conditions set by recruiters and do not give the applicant any kind of recommendation. These are the main drawbacks of the existing work.

Proposed work
The first step of the proposed method is to obtain resumes from the participants as PDF or Word documents. Participants are informed that the maximum allowable file size for submission is 200MB. Once a resume is submitted, the uploaded file is displayed promptly, allowing participants to review their submission and edit it if necessary. The uploaded resumes are then securely stored in our system for the subsequent phases of the proposed method.
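The upload constraints above can be sketched as a small validation step. This is a minimal illustration only: the function name `validate_upload`, the accepted extensions, and the messages are assumptions made for the example and are not taken from the implementation described here.

```python
from pathlib import Path
from typing import Tuple

# The text states PDF or Word input and a 200MB limit; the exact
# extensions accepted are an assumption for this sketch.
ALLOWED_EXTENSIONS = {".pdf", ".docx"}
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def validate_upload(filename: str, size_in_bytes: int) -> Tuple[bool, str]:
    """Return (accepted, message) for an uploaded resume file."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported file type: {extension or 'none'}"
    if size_in_bytes > MAX_UPLOAD_BYTES:
        return False, "File exceeds the 200MB limit"
    return True, "File accepted"
```

A call such as `validate_upload("resume.pdf", 350_000)` would accept the file, while a `.txt` file or an oversized document would be rejected with a message that can be shown to the participant.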
Once the uploaded file has been accepted, we move to the second phase, in which we extract the data from the resume. To do this efficiently, we use the pyresparser module. We call the pyresparser functions and store their return values in a variable. When called, pyresparser extracts the user's name, email id, phone number, skills, total experience, and more from the uploaded file and returns them to the caller. Internally, pyresparser uses the spaCy and NLTK modules to perform its NLP operations. pyresparser takes the uploaded resume as input and returns a list of dictionary objects. Each key of a dictionary names an attribute that identifies a person, for example the name, email id, or phone number. The value for each key is the corresponding information extracted from the user's resume, such as the person's actual name or phone number.
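To make the shape of the extracted data concrete, the following is a minimal stand-in that builds the same kind of list of dictionaries from plain resume text. It is not pyresparser and does not reproduce its internals: it uses simple regular expressions instead of spaCy and NLTK, and the function name `extract_resume_fields`, the key names, and the skill vocabulary are all assumptions made for illustration.

```python
import re

# Simplified patterns for illustration; real resumes need more robust parsing.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s-]{8,}\d")

# Hypothetical skill vocabulary, assumed only for this example.
KNOWN_SKILLS = {"python", "sql", "java", "machine learning"}


def extract_resume_fields(resume_text: str) -> list:
    """Return a list holding one dictionary of fields found in the text."""
    lines = [line.strip() for line in resume_text.splitlines() if line.strip()]
    email = EMAIL_PATTERN.search(resume_text)
    phone = PHONE_PATTERN.search(resume_text)
    lowered = resume_text.lower()
    record = {
        # Naive assumption: the first non-empty line is the candidate's name.
        "name": lines[0] if lines else None,
        "email": email.group(0) if email else None,
        "mobile_number": phone.group(0) if phone else None,
        "skills": sorted(skill for skill in KNOWN_SKILLS if skill in lowered),
    }
    return [record]


sample = "Asha Rao\nEmail: asha.rao@example.com\nPhone: +91 98765 43210\nSkills: Python, SQL"
parsed = extract_resume_fields(sample)
```

Each key identifies one attribute of the person and each value holds what was found in the resume, which is the structure the later phases read from.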

Objectives of the proposed work
• To identify the most qualified candidates for a particular position in any job opening.
• To automatically extract useful data from the resume, e.g., education, skills, achievements, and experience.
• To shorten recruitment time and reduce bias during the selection process by ranking candidates on various aspects of their resumes.

Architecture of the proposed work
The architecture diagram can be read as a flow chart in which every component has a specific meaning. A circle indicates the start or stop, and a diamond (rhombus) represents a decision or branching point. The lines leaving a diamond indicate the possible outcomes, which lead to different sub-processes. A rectangle denotes a specific operation. The lines show the sequence and direction of the process flow.
Using the variable that holds the user's data, we first display that data to the user. We then analyse the user's skills and predict the type of job that best fits them. Knowing the role that suits the user, we compare it with the skills the user has and determine the skills the user lacks. This data is saved, and we move to the next phase. We then use the number of pages and the user's skills to generate a rating, which we display to the user before proceeding to the next stage. Next, we take the list of skills the user lacks and recommend those skills along with links for learning them. After this, we analyse the whole resume and generate a score. Based on this score, we recommend additions that were not present in the uploaded resume, such as a list of hobbies and a career objective, to make the resume more eye-catching. Finally, we offer additional tips, such as how to answer basic interview questions, that may help the user perform better in interviews.
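The analysis steps above (role prediction, missing skills, and a resume score with suggested additions) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the system's actual logic: the role profiles, the checked sections, and the scoring rule (the share of expected sections present) are hypothetical, since the exact rating formula is not specified here.

```python
# Hypothetical role profiles; the real system's skill lists are not given.
ROLE_SKILLS = {
    "Data Scientist": {"python", "machine learning", "statistics", "sql"},
    "Web Developer": {"html", "css", "javascript", "react"},
}

# Hypothetical sections whose presence contributes to the score.
EXPECTED_SECTIONS = ["objective", "hobbies", "declaration", "achievements"]


def predict_role(user_skills):
    """Pick the role whose skill set overlaps most with the user's skills."""
    skills = {skill.lower() for skill in user_skills}
    return max(ROLE_SKILLS, key=lambda role: len(ROLE_SKILLS[role] & skills))


def missing_skills(user_skills, role):
    """List the skills of the predicted role that the user does not have."""
    skills = {skill.lower() for skill in user_skills}
    return sorted(ROLE_SKILLS[role] - skills)


def score_resume(resume_text):
    """Return (score out of 100, sections to recommend adding)."""
    lowered = resume_text.lower()
    absent = [section for section in EXPECTED_SECTIONS if section not in lowered]
    present_count = len(EXPECTED_SECTIONS) - len(absent)
    score = round(100 * present_count / len(EXPECTED_SECTIONS))
    return score, absent


role = predict_role(["Python", "SQL"])
to_learn = missing_skills(["Python", "SQL"], role)
score, suggestions = score_resume("Objective: data roles. Hobbies: chess.")
```

In this sketch the missing skills drive the course recommendations and the absent sections drive the suggested additions; a learned classifier could replace the overlap count in `predict_role` without changing the rest of the flow.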

Automated resume screening
Automated screening of resumes is the primary goal of the proposed smart resume analyser. To this end, each resume is analysed to determine whether it meets the requirements of the position applied for.

Enhanced accuracy
The use of machine learning algorithms and advanced techniques improves the accuracy of resume screening. It is also intended to reduce bias when selecting the best-suited candidates.

Augmented candidate experience
The smart resume analyser also aims to improve the experience of job seekers by providing feedback and suggestions for improving their CVs and overall applications.

Modules-connectivity diagram
The module diagram serves as an extended data flow diagram that plays a crucial role in defining the business logic of a system. It consists of the following key elements:
• Components: Components, such as computations or message displays, are the function blocks of the system. They are represented as rectangles and have input and output ports that allow data to be exchanged between them.
• Data Flow: Flow lines depict the movement of data between components. They define the order in which components are executed and the data that flows through the system. Depending on the type of data, they may be distinguished by line style or colour.
• Control Flow Constructs: Control flow constructs, such as those of Nassi-Shneiderman boxes, allow module logic to be defined in line with the principles of structured programming. These constructs include timed execution, looping, conditional branching, and recursive module calls.
• Module Events: The module diagram covers one or more module events, each shown individually within it. The details of a module event can be displayed by clicking on it or by selecting a set of events. A label below the module's name indicates that it is a module event.

Description about the dataset
In this model, we use a single base resume on which we perform the operations and check for the desired results. Figure 3 shows the first resume to be uploaded. Resumes are uploaded from the local system by browsing for the file. After the resume is uploaded, it is analysed. Figure 5 shows the resume being uploaded.

Conclusion and future enhancements
Our program accepts resumes in PDF or Word format using the streamlit module and a file uploader. The information is extracted from the uploaded file using pyresparser, which returns it as a list of dictionaries. We display the parsed information to the user and analyse their skills to predict the best-suited position. We identify the skills the user lacks and recommend learning resources. The resume is rated based on skills, content, and other factors, and the rating is shown to the user. We also recommend links for interview preparation. NLP and the spaCy module are used for parsing and analysing the resume. The Streamlit module is used for the interface design, while the pafy module is used to display YouTube links for learning the missing skills. The program aims to help users improve their resumes and make them more eye-catching. The Smart Resume Analyser currently works only if the user's skills are related to the computer science domain, for example the resume of an IT employee. If a resume lists skills outside the computer science domain, the proposed method cannot recommend courses or give a rating. Removing this limitation is one future enhancement. Others may include analysing social media data for insights into personality and work style.