Blog: Quantile loss function for machine learning
Motivation

It is not always sufficient for a machine learning model to make accurate predictions. For many commercial applications, it is equally important to have a measure of the prediction uncertainty. We recently worked on a project where predictions were subject to high uncertainty. The client required that their decisions be driven by both the predicted machine learning output and a measure of the potential prediction error. The quantile regression loss function solves this and similar problems by replacing a single-value prediction with prediction intervals. This post introduces the powerful quantile loss regression, gives an intuitive explanation of why it works and works through an example in Keras.

The quantile regression loss function

Machine learning models work by minimizing (or maximizing) an objective function. An objective function translates the problem we are trying to solve into a mathematical formula to be minimized by the model. As the name suggests, the quantile regression loss function is applied to predict quantiles. A quantile is the value below which a given fraction of the observations in a group falls. For example, a prediction for quantile 0.9 should over-predict 90% of the time. Given a prediction y_i^p and an outcome y_i, the regression loss for a quantile q is

L(y_i^p, y_i) = max[ q (y_i − y_i^p), (q − 1) (y_i − y_i^p) ]

For a set of predictions, the loss is the average over all points. A mathematical derivation of the above formula can be found in the Quantile Regression article on WikiWand. If you are interested in an intuitive explanation, read the following section. If you are just looking to apply the quantile loss function to a deep neural network, skip to the example section below.

Intuitive explanation

Let's start the intuitive explanation by considering the most commonly used quantile, the median. If q is substituted with 0.5 in the equation above, the mean absolute error function is obtained, which predicts the median. This is equivalent to saying that the mean absolute error loss function has its minimum at the median. A simple example is probably the easiest way to see why this is the case. Consider three points on a vertical line at different distances from each other: an upper point, a middle point and a lower point. In this one-dimensional example, the absolute error is the same as the distance. The hypothesis to be confirmed is that the mean absolute error is minimal at the median (the middle point). To check this, start at the middle point and move upward: the prediction gets closer to the upper point by some distance, but farther away, by that same distance, from both the middle and lower points, so the total, and hence the mean, absolute error (i.e. the mean distance to the three points) increases. The same applies when moving downward. The hypothesis is confirmed: the middle point is both the median and the minimum of the mean absolute error loss function. If instead of a single upper point and a single lower point we had one hundred points above and below, or any other arbitrary number, the result would still stand. In the regression loss equation above, since q has a value between 0 and 1, the first term is positive and dominates when under-predicting, y_i > y_i^p, and the second term dominates when over-predicting, y_i < y_i^p. Under-predictions are therefore weighted by q and over-predictions by 1 − q, so the larger q is, the more heavily the model is penalized for predicting too low, which pushes its predictions up towards the q-th quantile.
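As a rough illustration of how the formula above could be turned into a custom Keras loss, here is a minimal sketch. The function name quantile_loss and the commented compile call are illustrative assumptions for this sketch, not the post's original code.

    import tensorflow as tf

    def quantile_loss(q):
        # Returns a Keras-compatible loss for a single quantile q:
        # L = max(q * e, (q - 1) * e), averaged over the batch, with e = y_true - y_pred.
        def loss(y_true, y_pred):
            e = y_true - y_pred
            return tf.reduce_mean(tf.maximum(q * e, (q - 1) * e))
        return loss

    # Example usage (hypothetical model): train one model per quantile to obtain a
    # prediction interval, e.g. the 0.1 and 0.9 quantiles give an 80% interval.
    # model.compile(optimizer="adam", loss=quantile_loss(0.9))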
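As a quick numerical check of the intuitive explanation, the short sketch below (illustrative numbers, not from the post) finds the constant prediction that minimizes the quantile loss over a random sample. For q = 0.5 it lands near the median, and for q = 0.9 near the 90th percentile, i.e. the value that over-predicts roughly 90% of the time.

    import numpy as np

    def pinball(q, y, pred):
        # Quantile (pinball) loss for a constant prediction `pred`.
        e = y - pred
        return np.mean(np.maximum(q * e, (q - 1) * e))

    rng = np.random.default_rng(0)
    y = rng.normal(size=10_000)
    candidates = np.linspace(-3.0, 3.0, 601)

    for q in (0.5, 0.9):
        losses = [pinball(q, y, c) for c in candidates]
        best = candidates[int(np.argmin(losses))]
        print(q, round(best, 2), round(np.quantile(y, q), 2))
    # The loss-minimizing constant is close to the empirical q-th quantile.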