Overview
When training and using models in Rev.Up, users often want to know in detail how things work under the hood. This page answers many of these questions, and we continue to add new ones.
General FAQ
Can you train new iterations of a model after you have activated it?
Yes. Your newly trained iterations of the model will not impact the scoring with the active iteration. When you have reviewed the performance of the new iteration and are ready to use it for scoring, just activate the new iteration. The next Data Processing & Analysis job will then use the new iteration to compute model scores.
What algorithm is behind propensity scoring?
Propensity scoring uses binary classification to score your records. The Event column in your training data provides target labels for the supervised learning algorithm.
For Cross-sell models that work with transaction data, the Event column is created for you based on your settings, and uses point in time evaluation of your transaction history data.
What algorithm is behind revenue scoring?
The Cross-sell revenue scoring has two factors that contribute to the score:
- Propensity modeling (see above)
- Revenue modeling
Revenue modeling uses regression methods to estimate the revenue of a deal, assuming that it closes. Since deals may or may not close, the two factors together lead to the best prioritization strategy to maximize revenue across all targets.
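To illustrate how the two factors can work together, here is a minimal sketch that multiplies a close probability by a revenue estimate and ranks targets by the result. Multiplication is one common way to combine the two and is an assumption here, not a statement of Rev.Up's exact formula. The `Target` class, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    propensity: float          # hypothetical: estimated probability that the deal closes (0..1)
    revenue_if_closed: float   # hypothetical: estimated revenue, assuming the deal closes


def expected_revenue(target: Target) -> float:
    # Assumed combination rule: probability of closing times revenue if closed.
    return target.propensity * target.revenue_if_closed


def prioritize(targets):
    # Highest expected revenue first.
    return sorted(targets, key=expected_revenue, reverse=True)


# Made-up sample data for illustration only.
targets = [
    Target("A", 0.9, 1_000.0),   # very likely to close, small deal
    Target("B", 0.2, 10_000.0),  # unlikely to close, large deal
    Target("C", 0.5, 3_000.0),
]
ranked = [t.name for t in prioritize(targets)]
```

In this made-up example, ranking by propensity alone would put A first and ranking by deal size alone would put B first; combining the two also puts B first, but moves C ahead of A.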
What is a "negative universe"?
Sometimes the top-of-funnel view that Account Fit models use to estimate relative lift is called the "negative universe". But take this with a grain of salt - many of these are accounts that just haven't made it down the funnel yet.
When you score this universe with the model, the ones that are more likely to convert will get the higher scores.
How does Rev.Up identify personal email domains?
Rev.Up maintains an ever-growing list of personal email domains. Incoming email domains are checked against this list to determine whether they are personal emails.
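Conceptually, this is a lookup of the email's domain in a maintained list. The sketch below shows the idea only; the three sample domains and the function name `is_personal_email` are hypothetical, and the actual list is maintained by Rev.Up.

```python
# Hypothetical sample; the real list is much larger and maintained by Rev.Up.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}


def is_personal_email(email: str, personal_domains=PERSONAL_DOMAINS) -> bool:
    # Split on the last "@" so that only the domain part is compared.
    _, separator, domain = email.strip().rpartition("@")
    if not separator or not domain:
        return False  # not a usable email address
    return domain.lower() in personal_domains
```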
Are the sizes of slices in the donut chart important and what do they mean?
The donut chart in the Attribute Analysis tab shows a single-variable analysis for each attribute considered for your model. The different sizes of the attributes in the donut chart give a visual sense of the scale of predictive power for the attributes. The model has features selected from among all attributes seen in the donut chart.
How is the Feature Importance calculated?
The feature importance is a direct artifact of the underlying modeling process. The importance values in the RF Model CSV file reflect the relative importance of each feature (attribute) in computing the model score.
How is missing value imputation handled?
For numerical attributes, a value is imputed based on the conversion behavior of similar cohorts. The conversion rate of the missing-value cohort is compared with the conversion rates of the other cohorts, and the value of the best-matching cohort is imputed.
For categorical values (strings), a Null or an empty field is simply treated as a value in its own right.
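The following is a rough sketch of cohort-based imputation as described above, not Rev.Up's actual implementation. It assumes that cohorts are fixed value ranges and that the imputed value is the mean of the best-matching cohort; both choices, along with all function names and the sample data, are hypothetical.

```python
def conversion_rate(events):
    # Fraction of records in the cohort that converted.
    return sum(1 for e in events if e) / len(events)


def impute_numeric(records, bins):
    """records: list of (value_or_None, converted) pairs.
    bins: list of (low, high) half-open ranges that define the cohorts (assumed)."""
    missing_events = [ev for v, ev in records if v is None]
    if not missing_events:
        return None  # nothing to impute
    missing_rate = conversion_rate(missing_events)

    best_gap, best_value = None, None
    for low, high in bins:
        cohort = [(v, ev) for v, ev in records if v is not None and low <= v < high]
        if not cohort:
            continue
        gap = abs(conversion_rate([ev for _, ev in cohort]) - missing_rate)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # Assumed representative value: the mean of the matching cohort.
            best_value = sum(v for v, _ in cohort) / len(cohort)
    return best_value


def categorical_value(value):
    # Null or empty strings become their own category.
    return "<missing>" if value is None or value == "" else value


# Made-up data: the low cohort converts at 25%, the high cohort at 75%,
# and the missing-value cohort at 75%.
records = [
    (10, False), (20, False), (30, True), (40, False),
    (60, True), (70, False), (80, True), (90, True),
    (None, True), (None, True), (None, False), (None, True),
]
imputed = impute_numeric(records, bins=[(0, 50), (50, 100)])
```

Here the missing-value cohort behaves like the high cohort, so the imputed value comes from that cohort.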
Should I remove unmatched records from my training file before uploading it?
Removing unmatched records before modeling is optional. If the training file contains too many such records, the platform automatically downsamples them, which can yield better model scores on new records. If you want to remove some or all of the unmatched records from your file upfront, see below for recommendations on removing records based on Rev.Up field values.
How do I use the Remodel feature?
Remodeling trains a new model iteration using the same training data file and any changes in attribute selection or model training settings.
This functionality allows you to refresh your view of the Rev.Up Data Cloud without having to re-export and prepare the training file. You can therefore quickly take advantage of our incremental data updates, which arrive roughly every month.
When you use custom attributes in your models, Remodel also provides additional insights on how safe those attributes are for scoring.
What metric is used to optimize the model?
The model is optimized for the segmented lift of the conversion rate. This means that we look for high conversion rate for the top 20% of accounts, as well as a smooth, decreasing shape of the lift chart.
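As an illustration of what segmented lift measures, the sketch below ranks records by score, splits them into five equal segments (so the first segment is the top 20%), and divides each segment's conversion rate by the overall conversion rate. This is a generic lift calculation under those assumptions, not Rev.Up's optimization code; the function name and sample data are hypothetical.

```python
def segmented_lift(scored, n_segments=5):
    """scored: list of (score, converted) pairs.
    Returns one lift value per segment, highest-scoring segment first."""
    if len(scored) < n_segments:
        raise ValueError("need at least one record per segment")
    ranked = sorted(scored, key=lambda r: r[0], reverse=True)
    overall = sum(1 for _, c in ranked if c) / len(ranked)
    if overall == 0:
        raise ValueError("no conversions, lift is undefined")

    size = len(ranked) // n_segments
    lifts = []
    for i in range(n_segments):
        start = i * size
        # The last segment absorbs any remainder.
        end = (i + 1) * size if i < n_segments - 1 else len(ranked)
        segment = ranked[start:end]
        segment_rate = sum(1 for _, c in segment if c) / len(segment)
        lifts.append(segment_rate / overall)
    return lifts


# Made-up scores and outcomes: 4 conversions among 10 records.
scored = [
    (1.0, True), (0.9, True), (0.8, True), (0.7, False), (0.6, True),
    (0.5, False), (0.4, False), (0.3, False), (0.2, False), (0.1, False),
]
lifts = segmented_lift(scored)
# A well-behaved lift chart never rises from one segment to the next.
is_decreasing = all(a >= b for a, b in zip(lifts, lifts[1:]))
```

A lift above 1 in the first segment means the top 20% of accounts convert at a higher rate than the population as a whole.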
Is oversampling or undersampling used during training?
Rev.Up does not oversample your success events, as this may cause overfitting. In a few cases, such as for unmatched accounts, non-success events may be undersampled.
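A minimal sketch of this kind of undersampling follows. It keeps every success event and every matched record, and caps only the unmatched non-success records. The field names `event` and `is_matched`, the cap parameter, and the use of simple random sampling are assumptions made for illustration; Rev.Up's actual thresholds and sampling method are not described here.

```python
import random


def undersample_unmatched(records, max_unmatched_negatives, seed=0):
    """records: list of dicts with hypothetical keys 'event' and 'is_matched'."""
    keep, candidates = [], []
    for record in records:
        if record["event"] or record["is_matched"]:
            keep.append(record)        # success events are never dropped
        else:
            candidates.append(record)  # unmatched non-success events
    if len(candidates) > max_unmatched_negatives:
        # A fixed seed keeps the sample reproducible.
        candidates = random.Random(seed).sample(candidates, max_unmatched_negatives)
    return keep + candidates


# Made-up data: 3 successes, 7 matched non-successes, 20 unmatched non-successes.
records = [{"id": i, "event": i < 3, "is_matched": i < 10} for i in range(30)]
sample = undersample_unmatched(records, max_unmatched_negatives=5)
```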
Does Rev.Up compare model performance on training and test datasets to help identify overfitting?
Yes, Rev.Up scores both training and test datasets, and provides visualizations for the test dataset. This is a part of the diagnostics created for every model.
FAQ about LPI models
Can you download the PMML for a model?
No. Although LPI uses PMML internally in our model expression, this format is not available for download (except for cases where PMML was uploaded first).
Admin users in LPI do have access to the Model Summary page for their models, where a JSON expression of the model and many other artifacts can be found.
How do I remove spam-suppressing attributes generated by Rev.Up from my model?
Using or not using spam indicators is a first-class choice in the model creation flow. It is on by default for LPI Lead models, where user-filled data is frequent. If you do not want to use this while creating an LPI model, click Advanced Options in the CSV Upload screen and uncheck the “Enable Transformations” checkbox.
How do I identify records in my training data based on Rev.Up field values?
To remove records from your training data based on Rev.Up field values, try first running your file through the Score and Enrich flow in the model (flat file scoring). In LPI, open the model, click the Scoring icon on the left, then click the Score a File button. You can enrich with the firmographic fields you are interested in, including the “Is Matched” field.