Deep Quantile Regression | by Sachin Abeywardana


One area that Deep Learning has not explored extensively is the uncertainty in estimates. Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision, and this is where a distribution would be useful. Bayesian statistics lends itself to this problem really well, since a distribution over the dataset is inferred. However, Bayesian methods so far have been rather slow and would be expensive to apply to large datasets.

As far as decision making goes, most people actually require quantiles as opposed to true uncertainty in an estimate. For instance, when measuring a child's weight for a given age, the weight of an individual will vary. What would be interesting is (for argument's sake) the 10th and 90th percentile. Note that uncertainty is different to quantiles, in that I could request a confidence interval on the 90th quantile. This article will purely focus on inferring quantiles.

Quantile Regression Loss function

In regression the most commonly used loss function is the mean squared error. If we were to take the negative of this loss and exponentiate it, the result would correspond to the Gaussian distribution. The mode of this distribution (the peak) corresponds to the mean parameter. Hence, when we predict using a neural net that minimised this loss, we are predicting the mean value of the output, which may have been noisy in the training set.

The loss in Quantile Regression for an individual data point is defined as:

L_α(y, f(x)) = max(α (y − f(x)), (α − 1) (y − f(x)))

where α is the required quantile (a value between 0 and 1), f(x) is the predicted (quantile) model and y is the observed value for the corresponding input x. The average loss over the entire dataset is:

Loss = (1/N) Σᵢ L_α(yᵢ, f(xᵢ))

If we were to take the negative of the individual loss and exponentiate it, we get the distribution known as the Asymmetric Laplace distribution. The reason this loss function works is that the area under the density to the left of zero is exactly α, the required quantile.

[Figure: probability density function (pdf) of an Asymmetric Laplace distribution.]

The case when α = 0.5 is most likely more familiar, since it corresponds to the Mean Absolute Error (MAE). This loss function consistently estimates the median (50th percentile), instead of the mean.

Modelling in Keras

The forward model is no different to what you would have had when doing MSE regression. All that changes is the loss function. The following few lines define the loss function from the section above:

```python
import keras.backend as K

def tilted_loss(q, y, f):
    # Pinball (tilted) loss: under-predictions are weighted by q,
    # over-predictions by (1 - q).
    e = y - f
    return K.mean(K.maximum(q * e, (q - 1) * e), axis=-1)
```

When it comes to compiling the neural network, just simply do:

```python
quantile = 0.5
model.compile(loss=lambda y, f: tilted_loss(quantile, y, f), optimizer='adagrad')
```

For a full example see this Jupyter notebook, where I look at a motorcycle crash dataset over time. The results are reproduced below, where I show the 10th, 50th and 90th quantiles.

[Figure: acceleration over time of a crashed motorcycle, with the 10th, 50th and 90th quantile curves.]

Final Notes

Note that for each quantile I had to rerun the training. This is due to the fact that for each quantile the loss function is different, as the quantile itself is a parameter of the loss function. A sketch of this loop is shown below.

Due to the fact that each model is a simple rerun, there is a risk of quantile crossover, i.e. the 49th quantile may go above the 50th quantile at some stage.

Note that quantile 0.5 is the same as the median, which you can attain by minimising the Mean Absolute Error; you can do this in Keras regardless with loss='mae'.

Uncertainty and quantiles are not the same thing. But most of the time you care about quantiles and not uncertainty. If you really do want uncertainty with deep nets, check out http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html
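To make the per-quantile retraining concrete, here is a minimal sketch of that loop. The architecture and the synthetic data are illustrative assumptions of my own; the notebook's actual settings may differ:

```python
import numpy as np
import keras.backend as K
from keras.models import Sequential
from keras.layers import Dense

def tilted_loss(q, y, f):
    e = y - f
    return K.mean(K.maximum(q * e, (q - 1) * e), axis=-1)

# Illustrative synthetic data; any 1-D regression set would do.
x = np.random.uniform(-1, 1, size=(500, 1))
y = np.sin(3 * x) + 0.3 * np.random.randn(500, 1)

models = {}
for quantile in [0.1, 0.5, 0.9]:
    # A fresh model per quantile: the loss itself changes with q.
    model = Sequential([
        Dense(32, activation='relu', input_dim=1),
        Dense(32, activation='relu'),
        Dense(1),
    ])
    # Bind the current quantile via a default argument, so each
    # compiled loss keeps its own q rather than the last loop value.
    model.compile(loss=lambda y_t, y_p, q=quantile: tilted_loss(q, y_t, y_p),
                  optimizer='adagrad')
    model.fit(x, y, epochs=200, batch_size=32, verbose=0)
    models[quantile] = model

# Each fitted model now traces out its own quantile curve.
predictions = {q: m.predict(x) for q, m in models.items()}
```

Note the `q=quantile` default argument in the lambda: without it, every compiled loss would close over the same loop variable and end up using the final quantile.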
Edit 1: As pointed out by Anders Christiansen (in the responses), we may be able to get multiple quantiles in one go by having multiple objectives. Keras however combines all loss functions through a loss_weights argument, as shown here: https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models. It would be easier to implement this in TensorFlow. If anyone beats me to it, I would be happy to change my notebook/post to reflect this. As a rough guide, if we wanted the quantiles 0.1, 0.5 and 0.9, the last layer in Keras would have Dense(3) instead, with each node connected to a loss function.

Edit 2: Thanks to Jacob Zweig for implementing simultaneous multiple quantiles in TensorFlow: https://github.com/strongio/quantile-regression-tensorflow/blob/master/Quantile%20Loss.ipynb

After Thoughts

These are some thoughts that I am adding a year on from when the article was originally written.

Multiple Quantiles: The more I think about it, the more I am convinced that you should do all quantiles in one go. For example, if we required the quantiles 0.05, 0.5 and 0.95, have 3 output nodes, with each node having a different loss function (which is summed up to get the final loss). This ensures that the structure of the data is shared in the first few layers; see the sketch after this section.

Quantile crossover: To make sure that we avoid crossover, a non-median quantile, for instance 0.95, can be modelled as node_0.5 + sigma * sigmoid(node_0.95), where sigma is the maximum we would expect the 0.95 quantile to deviate away from the median. A similar idea can be enforced on the 0.05 quantile, with a negative sign on sigma * sigmoid(node_0.05) instead. A sketch of this parametrisation also follows below.

See here for my course on Machine Learning and Deep Learning (use code DEEPSCHOOL-MARCH for 90% off).
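Here is a minimal sketch of the "all quantiles in one go" idea: a shared trunk with a Dense(3) head and a single loss that sums the tilted loss of each output column against the same target. The layer sizes and hidden-layer count are placeholder assumptions; only the Dense(3)-plus-summed-loss structure comes from the text above.

```python
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense

quantiles = [0.05, 0.5, 0.95]

def multi_tilted_loss(y_true, y_pred):
    # y_pred carries one column per quantile; y_true is the single
    # observed value. Summing the per-column tilted losses trains
    # every output node towards its own quantile.
    loss = 0.0
    for i, q in enumerate(quantiles):
        e = y_true[:, 0] - y_pred[:, i]
        loss += K.mean(K.maximum(q * e, (q - 1) * e))
    return loss

inputs = Input(shape=(1,))
h = Dense(32, activation='relu')(inputs)
h = Dense(32, activation='relu')(h)
outputs = Dense(len(quantiles))(h)  # one node per quantile, shared trunk

model = Model(inputs, outputs)
model.compile(loss=multi_tilted_loss, optimizer='adagrad')
# model.fit(x, y, epochs=200)  # with y of shape (N, 1)
```

Because the trunk is shared, the first few layers learn the structure of the data once, and only the final nodes specialise per quantile.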

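The crossover-free parametrisation can be sketched as follows, with the outer quantiles expressed as bounded offsets from the median node. The value of sigma is a hypothetical constant you would set from domain knowledge:

```python
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense, Lambda, concatenate

sigma = 2.0  # assumed maximum deviation of the outer quantiles from the median

inputs = Input(shape=(1,))
h = Dense(32, activation='relu')(inputs)
h = Dense(32, activation='relu')(h)

node_low = Dense(1)(h)   # raw node driving the 0.05 quantile
node_med = Dense(1)(h)   # the median (0.5 quantile) prediction
node_high = Dense(1)(h)  # raw node driving the 0.95 quantile

# q_low  = median - sigma * sigmoid(node_low)  <= median
# q_high = median + sigma * sigmoid(node_high) >= median
q_low = Lambda(lambda t: t[0] - sigma * K.sigmoid(t[1]))([node_med, node_low])
q_high = Lambda(lambda t: t[0] + sigma * K.sigmoid(t[1]))([node_med, node_high])

# Output columns ordered as 0.05, 0.5, 0.95.
outputs = concatenate([q_low, node_med, q_high])
model = Model(inputs, outputs)
# Compile with the summed multi-quantile loss from the previous sketch:
# model.compile(loss=multi_tilted_loss, optimizer='adagrad')
```

Since the sigmoid is bounded in (0, 1), the 0.05 column can never rise above the median column and the 0.95 column can never fall below it, so the predicted quantiles cannot cross.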

