pyltr is a Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.
This software is licensed under the BSD 3-clause license (see LICENSE.txt).
The author may be contacted at ma127jerry <@t> gmail with general feedback, questions, or bug reports.
Import pyltr:
    import pyltr
Import a LETOR dataset (e.g. MQ2007):
    with open('train.txt') as trainfile, \
            open('vali.txt') as valifile, \
            open('test.txt') as evalfile:
        TX, Ty, Tqids, _ = pyltr.data.letor.read_dataset(trainfile)
        VX, Vy, Vqids, _ = pyltr.data.letor.read_dataset(valifile)
        EX, Ey, Eqids, _ = pyltr.data.letor.read_dataset(evalfile)
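For readers unfamiliar with the file format, the following is a minimal sketch of how a single LETOR-style line (`<label> qid:<id> <feature>:<value> ... # comment`) breaks down into the label, query id, and feature values that the loader above returns as arrays. The function name `parse_letor_line` and the sample line are hypothetical and written only for illustration; this is not pyltr's own parser.

```python
def parse_letor_line(line):
    """Parse one LETOR-style line into (label, qid, features, comment).

    Hypothetical helper for illustration; not part of pyltr.
    """
    # Everything after '#' is a free-form comment (often a document id).
    body, _, comment = line.partition('#')
    tokens = body.split()

    # The first token is the relevance label.
    label = int(tokens[0])

    # The second token looks like 'qid:10'.
    qid = tokens[1].split(':', 1)[1]

    # The remaining tokens are '<feature_id>:<value>' pairs.
    features = {}
    for token in tokens[2:]:
        fid, value = token.split(':', 1)
        features[int(fid)] = float(value)

    return label, qid, features, comment.strip()


# Made-up sample line in the LETOR layout.
sample = "2 qid:10 1:0.031310 2:0.666667 3:0.500000 #docid = GX029-35-5894638"
label, qid, features, comment = parse_letor_line(sample)
print(label, qid, features, comment)
```

Rows that share a `qid` belong to the same query, which is why the query ids are passed alongside the features and labels in the later steps.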
Train a LambdaMART model, using a validation set for early stopping and trimming:
    metric = pyltr.metrics.NDCG(k=10)

    # Only needed if you want to perform validation (early stopping & trimming)
    monitor = pyltr.models.monitors.ValidationMonitor(
        VX, Vy, Vqids, metric=metric, stop_after=250)

    model = pyltr.models.LambdaMART(
        metric=metric,
        n_estimators=1000,
        learning_rate=0.02,
        max_features=0.5,
        query_subsample=0.5,
        max_leaf_nodes=10,
        min_samples_leaf=64,
        verbose=1,
    )

    model.fit(TX, Ty, Tqids, monitor=monitor)
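The idea behind early stopping and trimming can be sketched in plain Python. The sketch assumes that `stop_after` means "stop once this many consecutive iterations pass without a better validation score" and that trimming means keeping only the estimators up to the best iteration; both are assumptions about the general technique, not a description of pyltr's internals. The function `best_iteration` and the score list are hypothetical.

```python
def best_iteration(val_scores, stop_after):
    """Return (index, score) of the best validation score seen before
    `stop_after` consecutive non-improving iterations occur.

    Hypothetical helper for illustration; not part of pyltr.
    """
    best_idx, best_score = -1, float('-inf')
    for i, score in enumerate(val_scores):
        if score > best_score:
            best_idx, best_score = i, score
        elif i - best_idx >= stop_after:
            break  # patience exhausted: stop training early
    return best_idx, best_score


# Made-up per-iteration validation scores (higher is better).
scores = [0.30, 0.35, 0.41, 0.40, 0.39, 0.42, 0.38]
idx, score = best_iteration(scores, stop_after=2)

# "Trimming": keep only the estimators up to and including the best one.
n_keep = idx + 1
print(idx, score, n_keep)
```

With a small patience the loop stops before reaching the later, higher score in the list, which illustrates why a comparatively generous value such as `stop_after=250` is paired with a small learning rate.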
Evaluate model on test data:
    Epred = model.predict(EX)
    print('Random ranking:', metric.calc_mean_random(Eqids, Ey))
    print('Our model:', metric.calc_mean(Eqids, Ey, Epred))
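To make the reported numbers easier to interpret, here is a minimal sketch of NDCG@k for a single query, written in plain Python 3. It assumes the common `2^rel - 1` gain and `log2(rank + 1)` discount, and it returns 0.0 for a query with no relevant documents; pyltr's own conventions for these details may differ, so treat the helper names `dcg_at_k` and `ndcg_at_k` and these choices as assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """DCG over the top-k labels, given in ranked order (assumed pow2 gain)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(labels, scores, k):
    """NDCG@k for one query: DCG of the predicted order over the ideal DCG."""
    # Rank documents by predicted score, highest first.
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]

    # The ideal ranking sorts documents by their true labels.
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)

    # Assumption: a query with no relevant documents scores 0.0.
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = [2, 0, 1]
print(ndcg_at_k(labels, [0.9, 0.1, 0.5], k=10))  # scores agree with labels
print(ndcg_at_k(labels, [0.1, 0.9, 0.5], k=10))  # scores reverse the labels
```

A mean over queries would apply `ndcg_at_k` to each query's rows separately and average the results.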
Below are some of the features currently implemented in pyltr.
Models:

- pyltr.models.LambdaMART

Metrics:

- pyltr.metrics.DCG, pyltr.metrics.NDCG
- pyltr.metrics.ERR
- pyltr.metrics.AP
- pyltr.metrics.KendallTau
- pyltr.metrics.AUCROC

Data wrangling:

- pyltr.data.letor.read
- pyltr.util.group.check_qids, pyltr.util.group.get_groups
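As a rough illustration of what query grouping and validation involve, the sketch below finds the contiguous row range of each query id and checks that no query id appears in more than one block. The functions `contiguous_groups` and `qids_are_grouped` are hypothetical stand-ins written for this example; they are not the signatures or behavior of `pyltr.util.group.get_groups` or `pyltr.util.group.check_qids`.

```python
from itertools import groupby


def contiguous_groups(qids):
    """Yield (qid, start, end) for each contiguous run of equal qids.

    `end` is exclusive, so rows[start:end] selects one query's rows.
    """
    start = 0
    for qid, run in groupby(qids):
        end = start + sum(1 for _ in run)
        yield qid, start, end
        start = end


def qids_are_grouped(qids):
    """Return True if every qid occupies exactly one contiguous block."""
    seen = [qid for qid, _, _ in contiguous_groups(qids)]
    return len(seen) == len(set(seen))


qids = ['a', 'a', 'b', 'c', 'c', 'c']
print(list(contiguous_groups(qids)))
print(qids_are_grouped(qids), qids_are_grouped(['a', 'b', 'a']))
```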
Use the run_tests.sh script to run all unit tests.
To build the docs, cd into the docs/ directory and run make html. Docs are generated in the docs/_build directory.
Quality contributions or bugfixes are gratefully accepted. When submitting a pull request, please update AUTHOR.txt so you can be recognized for your work :).
By submitting a GitHub pull request, you consent to have your submitted code released under the terms of the project's license (see LICENSE.txt).