Our Data Science robot intern

Automating the boring parts of machine learning

Machine learning practitioners know how overwhelming the number of choices can be when building a model. It’s like going to a restaurant with a menu the size of a book when we have never tried any of the dishes. What models do we test? How do we configure their parameters? Which features do we use? Those who try to solve this problem with ad-hoc manual experiments end up having their time consumed by menial tasks, and their work is constantly interrupted to check results and launch new experiments. This has motivated the rise of the field of automated machine learning.

The main tasks of automated ML are hyperparameter optimization and feature selection. In this article, we present Legiti’s solution to these tasks (no, we didn’t hire an intern to do that). We developed a simple algorithm that addresses both challenges and is designed for rapidly changing environments, such as the ones data scientists face when working on real-world business problems. We present some of its convenient properties and limitations, as well as possible extensions. We end by showing results obtained by applying it to one of our models.
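To make the two tasks concrete before getting to our own approach, here is a minimal sketch of a generic baseline: random search over feature subsets and hyperparameters together. This is not Legiti’s algorithm. The feature names, the hyperparameter grids, and the `toy_score` function are all hypothetical stand-ins; in practice `toy_score` would be replaced by training a model and returning a validation metric.

```python
import random

# Hypothetical search space: names and ranges are illustrative only.
FEATURES = ["amount", "hour_of_day", "account_age_days", "noise_a", "noise_b"]
LEARNING_RATES = [0.01, 0.03, 0.1, 0.3]
MAX_DEPTHS = [2, 4, 6, 8]


def sample_config(rng):
    """Draw one candidate: a non-empty feature subset plus hyperparameters."""
    features = [f for f in FEATURES if rng.random() < 0.5]
    if not features:
        # Guarantee at least one feature so the candidate is usable.
        features = [rng.choice(FEATURES)]
    return {
        "features": tuple(features),
        "learning_rate": rng.choice(LEARNING_RATES),
        "max_depth": rng.choice(MAX_DEPTHS),
    }


def toy_score(config):
    """Stand-in for 'train a model and return its validation metric'.

    Assumption for illustration: three features are informative, two are
    noise, and learning_rate=0.1 with max_depth=4 is the sweet spot.
    """
    useful = {"amount", "hour_of_day", "account_age_days"}
    score = sum(1.0 if f in useful else -0.5 for f in config["features"])
    score -= abs(config["learning_rate"] - 0.1)
    score -= abs(config["max_depth"] - 4) / 10
    return score


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = toy_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search(n_trials=50)
    print(config, score)
```

Even this toy version shows why automation helps: each trial is a full experiment that nobody has to launch or check by hand, and the search treats feature choice and parameter choice as a single configuration.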

At Legiti, we build machine learning models to fight credit card transaction fraud; that is, to predict whether or not an online purchase was made by (or with the consent of) the person who owns the credit card. We only learn that fraud has occurred some time later, when the owner of the credit card requests a refund, a process known as a chargeback. This information enters our modeling as input for a supervised learning algorithm.

If you already have a good knowledge of common tools for feature selection and hyperparameter optimization, you can skip to the “An outline of (the first version of) our algorithm” section.
