Keras Loss Functions: Everything You Need to Know
You've created a deep learning model in Keras, you prepared the data, and now you are wondering which loss you should choose for your problem.

We'll get to that in a second, but first: what is a loss function?

In deep learning, the loss is computed to get the gradients with respect to the model weights, and those weights are then updated accordingly via backpropagation. The loss is calculated and the network is updated after every iteration, until the updates no longer improve the desired evaluation metric.

So while you keep using the same evaluation metric, like F1 score or AUC, on the validation set during (long parts of) your machine learning project, the loss can be changed, adjusted, and modified to get the best performance on that metric.

You can think of the loss function just like you think about the model architecture or the optimizer, and it is important to put some thought into choosing it. In this piece we'll look at:

- loss functions available in Keras and how to use them,
- how you can define your own custom loss function in Keras,
- how to add sample weighing to create observation-sensitive losses,
- how to avoid nans in the loss,
- how you can monitor the loss function via plotting and callbacks.

Let's get into it!

Keras Loss functions 101

In Keras, loss functions are passed during the compile stage, as shown below.

In this example, we're defining the loss function by creating an instance of the loss class. Using the class is advantageous because you can pass some additional parameters.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))

# the model outputs raw scores (logits), so the loss is built with from_logits=True
loss_function = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(loss=loss_function, optimizer='adam')
```

If you want to use a loss function that is built into Keras without specifying any parameters, you can just use its string alias as shown below:

```python
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
```

You might be wondering: how does one decide which loss function to use?

There are various loss functions available in Keras. Other times you might have to implement your own custom loss function. Let's dive into all those scenarios.

Which loss functions are available in Keras?

Binary Classification

Binary classification loss functions come into play when solving a problem involving just two classes. For example, when predicting fraud in credit card transactions, a transaction is either fraudulent or not.

Binary Cross Entropy

BinaryCrossentropy computes the cross-entropy loss between the predicted classes and the true classes. By default, the sum_over_batch_size reduction is used, which means that the loss returns the average of the per-sample losses in the batch.

```python
import tensorflow as tf

y_true = [[0., 1.], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
y_pred = [[0.6, 0.4], [0.4, 0.6], [0.6, 0.4], [0.8, 0.2]]
bce = tf.keras.losses.BinaryCrossentropy(reduction='sum_over_batch_size')
bce(y_true, y_pred).numpy()
```

The sum reduction means that the loss function will return the sum of the per-sample losses in the batch:

```python
bce = tf.keras.losses.BinaryCrossentropy(reduction='sum')
bce(y_true, y_pred).numpy()
```

Using the reduction 'none' returns the full array of per-sample losses:

```python
bce = tf.keras.losses.BinaryCrossentropy(reduction='none')
bce(y_true, y_pred).numpy()
# array([0.9162905, 0.5919184, 0.79465103, 1.0549198], dtype=float32)
```
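Beyond choosing a reduction, a loss instance can also be called with a per-sample weight, which is one way to make individual observations count more or less toward the batch loss (we return to weighting later in this article). A minimal sketch, reusing the arrays above:

```python
import tensorflow as tf

y_true = [[0., 1.], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
y_pred = [[0.6, 0.4], [0.4, 0.6], [0.6, 0.4], [0.8, 0.2]]

bce = tf.keras.losses.BinaryCrossentropy()
# sample_weight scales each per-sample loss before the reduction is applied;
# here the third sample counts twice as much and the last one not at all
bce(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 0.0]).numpy()
```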
In binary classification, the activation function used is the sigmoid activation function. It constrains the output to a number between 0 and 1.

Multiclass classification

Problems involving the prediction of more than one class use different loss functions. In this section we'll look at a few:

Categorical Crossentropy

CategoricalCrossentropy also computes the cross-entropy loss between the true classes and predicted classes, but here the labels are expected in a one-hot format.

```python
# example one-hot labels and predicted probabilities
y_true = [[0., 1., 0.], [0., 0., 1.]]
y_pred = [[0.05, 0.95, 0.], [0.1, 0.8, 0.1]]
cce = tf.keras.losses.CategoricalCrossentropy()
cce(y_true, y_pred).numpy()
```

Sparse Categorical Crossentropy

If you have two or more classes and the labels are integers, SparseCategoricalCrossentropy should be used:

```python
y_true = [0, 1, 2]
y_pred = [[0.05, 0.95, 0.], [0.1, 0.8, 0.1], [0.1, 0.8, 0.1]]
scce = tf.keras.losses.SparseCategoricalCrossentropy()
scce(y_true, y_pred).numpy()
```

The Poisson Loss

You can also use the Poisson class to compute the Poisson loss. It's a great choice if your target comes from a Poisson distribution, for example the number of calls a call center receives per hour.

```python
y_true = [[0.1, 1., 0.8], [0.1, 0.9, 0.1], [0.2, 0.7, 0.1], [0.3, 0.1, 0.6]]
y_pred = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.7, 0.1, 0.2], [0.8, 0.1, 0.1]]
p = tf.keras.losses.Poisson()
p(y_true, y_pred).numpy()
```

Kullback-Leibler Divergence Loss

The relative entropy can be computed using the KLDivergence class. As the official PyTorch docs put it:

> KL divergence is a useful distance measure for continuous distributions and is often useful when performing direct regression over the space of (discretely sampled) continuous output distributions.

```python
y_true = [[0.1, 1., 0.8], [0.1, 0.9, 0.1], [0.2, 0.7, 0.1], [0.3, 0.1, 0.6]]
y_pred = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.7, 0.1, 0.2], [0.8, 0.1, 0.1]]
kl = tf.keras.losses.KLDivergence()
kl(y_true, y_pred).numpy()
```

In a multi-class problem, the activation function used is the softmax function.

Object Detection

The Focal Loss

In classification problems involving imbalanced data, and in object detection problems, you can use the focal loss. It adjusts the cross-entropy criterion by altering its shape so that the loss allocated to well-classified examples is down-weighted. This helps the model learn from the minority class as well as the majority class.

Concretely, the cross-entropy loss is multiplied by a scaling factor that decays to zero as the confidence in the correct class increases. This down-weights the contribution of easy samples at training time and focuses learning on the hard ones.

```python
import tensorflow_addons as tfa

y_true = [[0.97], [0.91], [0.03]]
y_pred = [[1.0], [1.0], [0.0]]
sfc = tfa.losses.SigmoidFocalCrossEntropy()
sfc(y_true, y_pred).numpy()
# array([0.00010971, 0.00329749, 0.00030611], dtype=float32)
```

Generalized Intersection over Union

The Generalized Intersection over Union loss from TensorFlow Addons can also be used. Intersection over Union (IoU) is a very common metric in object detection problems, but it is not very informative for non-overlapping bounding boxes: the value stays the same no matter how far apart the boxes are.

Generalized Intersection over Union (GIoU) was introduced to address this shortcoming. It keeps the scale-invariant property of IoU, encodes the shape properties of the compared objects into the region property, and correlates strongly with IoU when objects do overlap.

```python
gl = tfa.losses.GIoULoss()
boxes1 = tf.constant([[4.0, 3.0, 7.0, 5.0], [5.0, 6.0, 10.0, 7.0]])
boxes2 = tf.constant([[3.0, 4.0, 6.0, 8.0], [14.0, 14.0, 15.0, 15.0]])
loss = gl(boxes1, boxes2)
```
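Like any other Keras loss, GIoULoss can also be passed straight to compile for a model that regresses bounding-box coordinates (the tensorflow/addons docstring shows exactly this pattern). A minimal sketch, where the model architecture and input shape are hypothetical, assuming four box coordinates per example:

```python
import tensorflow as tf
import tensorflow_addons as tfa

# hypothetical box-regression head: 4 outputs per example,
# interpreted as [y_min, x_min, y_max, x_max]
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(128,)),
    tf.keras.layers.Dense(4),
])
model.compile(optimizer='sgd', loss=tfa.losses.GIoULoss())
```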
Regression

In regression problems, you have to calculate the differences between the predicted values and the true values, but as always there are many ways to do it.

Mean Squared Error

The MeanSquaredError class can be used to compute the mean of the squared errors between the predictions and the true values:

```python
y_true = [12., 20., 29., 60.]
y_pred = [14., 18., 27., 55.]
mse = tf.keras.losses.MeanSquaredError()
mse(y_true, y_pred).numpy()
```

Use MeanSquaredError when you want large errors to be penalized more than small ones.

Mean Absolute Percentage Error

The mean absolute percentage error is calculated as:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

```python
y_true = [12., 20., 29., 60.]
y_pred = [14., 18., 27., 55.]
mape = tf.keras.losses.MeanAbsolutePercentageError()
mape(y_true, y_pred).numpy()
```

Consider using this loss when you want a loss that you can explain intuitively: people understand percentages easily. The loss is also robust to outliers.

Mean Squared Logarithmic Error

The mean squared logarithmic error is computed as:

$$\mathrm{MSLE} = \frac{1}{n}\sum_{i=1}^{n}\bigl(\log(y_i + 1) - \log(\hat{y}_i + 1)\bigr)^2$$

Here's an implementation of the same:

```python
y_true = [12., 20., 29., 60.]
y_pred = [14., 18., 27., 55.]
msle = tf.keras.losses.MeanSquaredLogarithmicError()
msle(y_true, y_pred).numpy()
```

MeanSquaredLogarithmicError penalizes underestimates more than it does overestimates. It's a great choice when you prefer not to penalize large errors heavily; it is, therefore, robust to outliers.

Cosine Similarity Loss

If your interest is in computing the cosine similarity between the true and predicted values, you'd use the CosineSimilarity class. The loss is the negative cosine similarity:

$$\mathrm{loss} = -\frac{\sum_i y_i\,\hat{y}_i}{\lVert y\rVert_2\,\lVert \hat{y}\rVert_2}$$

The result is a number between -1 and 1. 0 indicates orthogonality, while values close to -1 show that there is great similarity.

```python
y_true = [[12., 20.], [29., 60.]]
y_pred = [[14., 18.], [27., 55.]]
cosine_loss = tf.keras.losses.CosineSimilarity(axis=1)
cosine_loss(y_true, y_pred).numpy()
```

LogCosh Loss

The LogCosh class computes the logarithm of the hyperbolic cosine of the prediction error. Here's how to use it:

```python
y_true = [[12., 20.], [29., 60.]]
y_pred = [[14., 18.], [27., 55.]]
l = tf.keras.losses.LogCosh()
l(y_true, y_pred).numpy()
```

> LogCosh loss works like the mean squared error, but will not be so strongly affected by the occasional wildly incorrect prediction. — TensorFlow Docs

Huber loss

For regression problems that are less sensitive to outliers, the Huber loss is used:

```python
y_true = [12., 20., 29., 60.]
y_pred = [14., 18., 27., 55.]
h = tf.keras.losses.Huber()
h(y_true, y_pred).numpy()
```

Learning Embeddings

Triplet Loss

You can also compute the triplet loss with semi-hard negative mining via TensorFlow Addons. The loss encourages the positive distances between pairs of embeddings with the same labels to be less than the minimum negative distance.

```python
import tensorflow_addons as tfa

model.compile(optimizer='adam',
              loss=tfa.losses.TripletSemiHardLoss(),
              metrics=['accuracy'])
```

Creating custom loss functions in Keras

Sometimes there is no good loss available, or you need to implement some modifications. Let's learn how to do that.

A custom loss function can be created by defining a function that takes the true values and predicted values as required parameters. The function should return an array of per-sample losses; Keras then applies the reduction. The function can then be passed at the compile stage:

```python
def custom_loss_function(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)

model.compile(optimizer='adam', loss=custom_loss_function)
```

Let's see how we can apply this custom loss function to an array of predicted and true values:

```python
import numpy as np

y_true = [12., 20., 29., 60.]
y_pred = [14., 18., 27., 55.]
cl = custom_loss_function(np.array(y_true), np.array(y_pred))
cl.numpy()
```
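If your custom loss needs configuration, you can also subclass tf.keras.losses.Loss and implement its call method; instances then behave like the built-in loss classes shown earlier. A minimal sketch, where the loss itself and its scaling factor are hypothetical:

```python
import tensorflow as tf

class ScaledMSE(tf.keras.losses.Loss):
    """Hypothetical example: mean squared error scaled by a constant factor."""

    def __init__(self, factor=1.0, name="scaled_mse"):
        super().__init__(name=name)
        self.factor = factor

    def call(self, y_true, y_pred):
        # return per-sample losses; Keras applies the configured reduction
        return self.factor * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

model.compile(optimizer='adam', loss=ScaledMSE(factor=0.5))
```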
Use of Keras loss weights

During the training process, one can weigh the loss function by observations or samples. The weights can be arbitrary, but a typical choice is class weights based on the distribution of the labels: each observation is weighted by the inverse of the fraction of the class it belongs to, so that the loss on minority-class observations counts for more.

One way of doing this is passing the class weights during the training process. The weights are passed using a dictionary that contains the weight for each class. You can compute the weights using Scikit-learn (see the sketch at the end of this section) or calculate them based on your own criterion.

```python
weights = {0: 1.01300017, 1: 0.88994364, 2: 1.00704935, 3: 0.97863318,
           4: 1.02704553, 5: 1.10680686, 6: 1.01385603, 7: 0.95770152,
           8: 1.02546573, 9: 1.00857287}
model.fit(x_train, y_train, verbose=1, epochs=10, class_weight=weights)
```

A related option is the loss_weights argument at the compile stage. Note that, unlike class_weight, it weights the losses of the different outputs of a multi-output model (one scalar per output), not individual classes:

```python
# for a model with two outputs, weighting the first output's loss higher
model.compile(optimizer=tf.keras.optimizers.SGD(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              loss_weights=[1.0, 0.5],
              metrics=['accuracy'])
```
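As mentioned above, Scikit-learn can compute balanced class weights for you. A minimal sketch using sklearn.utils.class_weight.compute_class_weight, assuming y_train holds integer class labels:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# 'balanced' weights each class inversely proportional to its frequency
classes = np.unique(y_train)
class_weights = compute_class_weight(class_weight='balanced',
                                     classes=classes,
                                     y=y_train)

# Keras expects a {class_index: weight} dictionary
weights = dict(zip(classes, class_weights))
model.fit(x_train, y_train, epochs=10, class_weight=weights)
```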
How to monitor the Keras loss function [example]

It is usually a good idea to monitor the loss function on the training and validation sets as the model is training. Looking at those learning curves is a good indication of overfitting or other problems with model training.

There are two main ways to do it.

Monitor Keras loss using console logs

The quickest and easiest way to log and look at the losses is simply printing them to the console:

```python
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, verbose=1, epochs=10)
```

The problem with this approach is that those logs can be easily lost, it is difficult to see progress, and when working on remote machines you may not have access to them.

Monitor Keras loss using a callback

Another, cleaner option is to use a callback that logs the loss somewhere on every batch and epoch end. You need to decide where and what you would like to log, but it is really simple.

For example, logging the Keras loss to Neptune could look like this:

```python
from keras.callbacks import Callback

class NeptuneCallback(Callback):
    def __init__(self, run, base_namespace='metrics'):
        super().__init__()
        self.run = run
        self.base_namespace = base_namespace

    def on_batch_end(self, batch, logs=None):
        for metric_name, metric_value in (logs or {}).items():
            self.run[f"{self.base_namespace}/{metric_name}"].log(metric_value)

    def on_epoch_end(self, epoch, logs=None):
        for metric_name, metric_value in (logs or {}).items():
            self.run[f"{self.base_namespace}/{metric_name}"].log(metric_value)
```

You can create the monitoring callback yourself or use one of the many available Keras callbacks, both in the Keras library and in other libraries that integrate with it, like TensorBoard or Neptune.

Once you have the callback ready, you simply pass it to model.fit(...):

```python
# pip install neptune-client
import neptune.new as neptune

run = neptune.init(
    project='common/tf-keras-integration',
    api_token='ANONYMOUS'
)

neptune_cbk = NeptuneCallback(run=run, base_namespace='metrics')
model.fit(x_train, y_train,
          validation_split=0.2,
          epochs=10,
          callbacks=[neptune_cbk])
```

You can then monitor your experiment's learning curves in the Neptune UI (the Keras training dashboard).

Why Keras loss nan happens

Most of the time the losses you log will be just regular values, but sometimes you might get nans when working with Keras loss functions. When that happens, your model will not update its weights and will stop learning, so this situation needs to be avoided.

There can be many reasons for a nan loss, but usually what happens is one of:

- nans in the training set, which lead to nans in the loss,
- NumPy infinite values in the training set, which also lead to nans in the loss,
- using a training set that is not scaled,
- use of a very large l2 regularizer together with a learning rate above 1,
- use of the wrong optimizer function,
- large (exploding) gradients that result in a large update to network weights during training.

So in order to avoid nans in the loss:

- check that your training data is properly scaled and doesn't contain nans;
- check that you are using the right optimizer and that your learning rate is not too large;
- check that the l2 regularization is not too large;
- if you are facing the exploding gradient problem, you can either re-design the network or use gradient clipping so that your gradients have a certain "maximum allowed model update" (see the sketch after this list).
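Gradient clipping is available directly on Keras optimizers. A minimal sketch, with hypothetical clipping thresholds and learning rate:

```python
import tensorflow as tf

# clipnorm rescales each gradient so its L2 norm is at most 1.0;
# clipvalue=0.5 would instead clip each gradient element to [-0.5, 0.5]
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)

model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```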
Final thoughts

Hopefully, this article gave you some background into loss functions in Keras. We've covered:

- built-in loss functions in Keras,
- implementation of your own custom loss functions,
- how to add sample weighing to create observation-sensitive losses,
- how to avoid nans in the loss,
- how you can visualize the loss as your model is training.

For more information, check out the Keras repository and the TensorFlow loss functions documentation.