ML API

The workflow of the recommendation system that we created is outlined below.

[Lokergo Flowcharts](https://drive.google.com/file/d/1TDD3FxPrt1yLpuH-R_WcjDbCoGbuIboV/view?usp=sharing). Link to our online flowchart.

Model A refers to all-MiniLM-L6-v2 and Model B refers to sts-trained-lokergo.
The input highlighted in color in the flowchart is encoded only once per day, after the job data scraping. Because the bi-encoder model encodes each sentence independently, the job encodings can be reused, which saves computational resources.
'u' and 'v' represent the encoded sentences.
We combine the Normalized Cosine Similarity of User Job Preferences and the Normalized Cosine Similarity of User Skills using a Combiner with the calculation x(weight) + y(weight), where the weight serves as a custom weighting between the two Cosine Similarities, with a default value of 0.5.
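
The Combiner step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the normalization is assumed to map cosine similarity from [-1, 1] to [0, 1], and the formula is read as a convex combination (`weight` on x, `1 - weight` on y), which matches the stated default of 0.5.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length encoded sentences u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def normalize(score):
    """Assumed normalization: rescale a cosine score from [-1, 1] to [0, 1]."""
    return (score + 1.0) / 2.0


def combine(pref_sim, skill_sim, weight=0.5):
    """Assumed reading of the Combiner: weight on x, (1 - weight) on y."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * pref_sim + (1.0 - weight) * skill_sim


# Toy vectors standing in for encoded sentences.
pref = normalize(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # identical vectors
skill = normalize(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
score = combine(pref, skill)
```

Raising `weight` above 0.5 favors the job preference similarity, and lowering it favors the skill similarity.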

More Information

We use the all-MiniLM-L6-v2 model as a lightweight bi-encoder without retraining, and the sts-trained-lokergo model as a retrained cross-encoder, obtained through transfer learning from the sentence-t5-base model, which is specifically designed for sentence similarity tasks. Both are used to calculate the similarity between user preferences and job titles. We use TF-IDF to measure the similarity between user skills and job skills because it is lightweight and fast and works without requiring dense contextual understanding. We selected all-MiniLM-L6-v2 for its lightweight design and decent accuracy, and sts-trained-lokergo for its high accuracy despite its heavier computational requirements. Consequently, we use sts-trained-lokergo only to compute sentence similarity within the top 100 results obtained from all-MiniLM-L6-v2.
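
The two-stage ranking described above can be sketched like this. It is a simplified illustration under stated assumptions: `retrieve_then_rerank`, `shared_words`, and `jaccard` are hypothetical names, and the two toy scoring functions merely stand in for all-MiniLM-L6-v2 (fast) and sts-trained-lokergo (accurate) so that the example runs without downloading any model.

```python
from typing import Callable, List, Sequence, Tuple


def retrieve_then_rerank(
    query: str,
    jobs: Sequence[str],
    fast_score: Callable[[str, str], float],
    accurate_score: Callable[[str, str], float],
    top_k: int = 100,
) -> List[Tuple[str, float]]:
    """Shortlist with the cheap scorer, then re-score only the shortlist."""
    # Stage 1: rank every job with the lightweight model and keep the top_k.
    shortlist = sorted(jobs, key=lambda job: fast_score(query, job), reverse=True)[:top_k]
    # Stage 2: apply the heavier model to the shortlist only.
    reranked = [(job, accurate_score(query, job)) for job in shortlist]
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked


def shared_words(a: str, b: str) -> float:
    """Toy stand-in for the fast model: number of shared words."""
    return float(len(set(a.split()) & set(b.split())))


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for the accurate model: Jaccard word overlap."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


jobs = ["data engineer", "senior data scientist", "data scientist", "graphic designer"]
result = retrieve_then_rerank("data scientist", jobs, shared_words, jaccard, top_k=2)
```

The heavier scorer runs on at most `top_k` candidates per query, which is what keeps its cost bounded regardless of how many jobs were scraped.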

For Collaborative Filtering, we use TF-IDF to calculate the similarity in preferences between User A and other users. The user with the highest similarity value is treated as the most similar to User A. Users with similar preferences then exchange their content-based recommendations with each other, and these are displayed in the third recommendation section on the website. Finally, the algorithm is deployed with FastAPI on Google Cloud Run and connected to the backend database.
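
A rough sketch of the collaborative step is shown below. It assumes whitespace tokenization, a smoothed IDF, and a single nearest neighbor per user; the helper names and the sample data are hypothetical and are not taken from the deployed code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a list of texts."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF so that terms present in every text keep a non-zero weight.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_user(target, preferences):
    """Return the other user whose preference text is closest to the target's."""
    users = list(preferences)
    vectors = dict(zip(users, tfidf_vectors([preferences[u] for u in users])))
    others = [u for u in users if u != target]
    return max(others, key=lambda u: cosine(vectors[target], vectors[u]))


# Hypothetical sample data.
preferences = {
    "user_a": "backend developer python",
    "user_b": "python backend engineer",
    "user_c": "graphic designer illustrator",
}
content_based = {
    "user_a": ["Job 1", "Job 2"],
    "user_b": ["Job 2", "Job 3"],
    "user_c": ["Job 9"],
}

neighbor = most_similar_user("user_a", preferences)
# User A receives the neighbor's content-based recommendations.
collaborative = content_based[neighbor]
```

In this sketch the exchange is one-directional for brevity; running the same lookup for every user gives each of them their neighbor's recommendations in turn.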

Three recommendation sections on the website:

Content-Based Filtering
Content-Based Filtering + User Location Filter
Collaborative Filtering

Future Improvements:

Increase the number of data sources from other platforms.
Enhance model performance by expanding the training dataset.
Modularize the algorithm code for improved scalability.

Reference & Tech stacks

HuggingFace STS Model Ranking. Link to the similarity task model leaderboard.
Difference of Bi-Encoder & Cross-Encoder. Detailed explanation of bi-encoder and cross-encoder.
sts-trained-lokergo Model. Our trained model is hosted in a HuggingFace repository for easy access.

C23-VR01 ML Teams.
