# Losses


The purpose of loss functions is to compute the quantity that a model should seek to minimize during training.

## Available losses

Note that all losses are available both via a class handle and via a function handle. The class handles enable you to pass configuration arguments to the constructor (e.g. `loss_fn = CategoricalCrossentropy(from_logits=True)`), and they perform reduction by default when used in a standalone way (see details below).

### Probabilistic losses

- `BinaryCrossentropy` class
- `CategoricalCrossentropy` class
- `SparseCategoricalCrossentropy` class
- `Poisson` class
- `binary_crossentropy` function
- `categorical_crossentropy` function
- `sparse_categorical_crossentropy` function
- `poisson` function
- `KLDivergence` class
- `kl_divergence` function

### Regression losses

- `MeanSquaredError` class
- `MeanAbsoluteError` class
- `MeanAbsolutePercentageError` class
- `MeanSquaredLogarithmicError` class
- `CosineSimilarity` class
- `mean_squared_error` function
- `mean_absolute_error` function
- `mean_absolute_percentage_error` function
- `mean_squared_logarithmic_error` function
- `cosine_similarity` function
- `Huber` class
- `huber` function
- `LogCosh` class
- `log_cosh` function

### Hinge losses for "maximum-margin" classification

- `Hinge` class
- `SquaredHinge` class
- `CategoricalHinge` class
- `hinge` function
- `squared_hinge` function
- `categorical_hinge` function

## Usage of losses with compile() & fit()

A loss function is one of the two arguments required for compiling a Keras model:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))

loss_fn = keras.losses.SparseCategoricalCrossentropy()
model.compile(loss=loss_fn, optimizer='adam')
```

All built-in loss functions may also be passed via their string identifier:

```python
# Pass the loss by name: default parameters will be used.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
```

Loss functions are typically created by instantiating a loss class (e.g. `keras.losses.SparseCategoricalCrossentropy`). All losses are also provided as function handles (e.g. `keras.losses.sparse_categorical_crossentropy`).

Using classes enables you to pass configuration arguments at instantiation time, e.g.:

```python
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
```

## Standalone usage of losses

A loss is a callable with arguments `loss_fn(y_true, y_pred, sample_weight=None)`:

- `y_true`: Ground truth values, of shape `(batch_size, d0, ... dN)`. For sparse loss functions, such as sparse categorical crossentropy, the shape should be `(batch_size, d0, ... dN-1)`.
- `y_pred`: The predicted values, of shape `(batch_size, d0, ... dN)`.
- `sample_weight`: Optional `sample_weight` acts as a reduction weighting coefficient for the per-sample losses. If a scalar is provided, then the loss is simply scaled by the given value. If `sample_weight` is a tensor of size `[batch_size]`, then the total loss for each sample of the batch is rescaled by the corresponding element in the `sample_weight` vector. If the shape of `sample_weight` is `(batch_size, d0, ... dN-1)` (or can be broadcast to this shape), then each loss element of `y_pred` is scaled by the corresponding value of `sample_weight`. (Note on `dN-1`: all loss functions reduce by 1 dimension, usually `axis=-1`.) See the sketch right after this list.
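For instance, here is a minimal standalone sketch (the loss choice and the values are purely illustrative) of how the different `sample_weight` forms affect the reduced result of a loss class instance:

```python
import tensorflow as tf

loss_fn = tf.keras.losses.BinaryCrossentropy()
y_true = [[0.0], [1.0]]
y_pred = [[0.1], [0.6]]

# No weighting: the average of the two per-sample losses.
print(loss_fn(y_true, y_pred).numpy())

# Scalar weight: the same value, simply scaled by 2.
print(loss_fn(y_true, y_pred, sample_weight=2.0).numpy())

# Per-sample weights: a weight of 0.0 zeroes out the second sample's
# contribution to the reduced loss.
print(loss_fn(y_true, y_pred, sample_weight=[1.0, 0.0]).numpy())
```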
By default, loss functions return one scalar loss value per input sample, e.g.

```
>>> tf.keras.losses.mean_squared_error(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 1.], dtype=float32)>
```

However, loss class instances feature a `reduction` constructor argument, which defaults to `"sum_over_batch_size"` (i.e. average). Allowable values are `"sum_over_batch_size"`, `"sum"`, and `"none"`:

- `"sum_over_batch_size"` means the loss instance will return the average of the per-sample losses in the batch.
- `"sum"` means the loss instance will return the sum of the per-sample losses in the batch.
- `"none"` means the loss instance will return the full array of per-sample losses.

```
>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='sum_over_batch_size')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='sum')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(), dtype=float32, numpy=2.0>

>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='none')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 1.], dtype=float32)>
```

Note that this is an important difference between loss functions like `tf.keras.losses.mean_squared_error` and default loss class instances like `tf.keras.losses.MeanSquaredError`: the function version does not perform reduction, but by default the class instance does.

```
>>> loss_fn = tf.keras.losses.mean_squared_error
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 1.], dtype=float32)>

>>> loss_fn = tf.keras.losses.MeanSquaredError()
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(), dtype=float32, numpy=1.0>
```

When using `fit()`, this difference is irrelevant since reduction is handled by the framework.

Here's how you would use a loss class instance as part of a simple training loop:

```python
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# Iterate over the batches of a dataset.
for x, y in dataset:
    with tf.GradientTape() as tape:
        logits = model(x)
        # Compute the loss value for this batch.
        loss_value = loss_fn(y, logits)

    # Update the weights of the model to minimize the loss value.
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
```

## Creating custom losses

Any callable with the signature `loss_fn(y_true, y_pred)` that returns an array of losses (one entry per sample in the input batch) can be passed to `compile()` as a loss. Note that sample weighting is automatically supported for any such loss.

Here's a simple example:

```python
def my_loss_fn(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)  # Note the `axis=-1`

model.compile(optimizer='adam', loss=my_loss_fn)
```
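If the custom loss needs configuration arguments, one option (not shown in the example above) is to subclass `tf.keras.losses.Loss` and implement `call(y_true, y_pred)`. The sketch below is illustrative only; the class name and its `factor` argument are hypothetical, not part of Keras:

```python
import tensorflow as tf

class WeightedMSE(tf.keras.losses.Loss):
    """Hypothetical custom loss: mean squared error scaled by a constant factor."""

    def __init__(self, factor=1.0, name="weighted_mse"):
        super().__init__(name=name)
        self.factor = factor

    def call(self, y_true, y_pred):
        # As with a plain loss function, return one loss value per sample;
        # the base class handles reduction and `sample_weight`.
        y_true = tf.cast(y_true, y_pred.dtype)
        return self.factor * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

# Reusing the `model` built earlier on this page.
model.compile(optimizer='adam', loss=WeightedMSE(factor=0.5))
```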
## The add_loss() API

Loss functions applied to the output of a model aren't the only way to create losses.

When writing the `call` method of a custom layer or a subclassed model, you may want to compute scalar quantities that you want to minimize during training (e.g. regularization losses). You can use the `add_loss()` layer method to keep track of such loss terms.

Here's an example of a layer that adds a sparsity regularization loss based on the L2 norm of the inputs:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class MyActivityRegularizer(Layer):
    """Layer that creates an activity sparsity regularization loss."""

    def __init__(self, rate=1e-2):
        super(MyActivityRegularizer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # We use `add_loss` to create a regularization loss
        # that depends on the inputs.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs
```

Loss values added via `add_loss` can be retrieved in the `.losses` list property of any `Layer` or `Model` (they are recursively retrieved from every underlying layer):

```python
from tensorflow.keras import layers

class SparseMLP(Layer):
    """Stack of Linear layers with a sparsity regularization loss."""

    def __init__(self, output_dim):
        super(SparseMLP, self).__init__()
        self.dense_1 = layers.Dense(32, activation=tf.nn.relu)
        self.regularization = MyActivityRegularizer(1e-2)
        self.dense_2 = layers.Dense(output_dim)

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.regularization(x)
        return self.dense_2(x)


mlp = SparseMLP(1)
y = mlp(tf.ones((10, 10)))

print(mlp.losses)  # List containing one float32 scalar
```

These losses are cleared by the top-level layer at the start of each forward pass -- they don't accumulate. So `layer.losses` always contains only the losses created during the last forward pass. You would typically use these losses by summing them before computing your gradients when writing a training loop.

```python
# Losses correspond to the *last* forward pass.
mlp = SparseMLP(1)
mlp(tf.ones((10, 10)))
assert len(mlp.losses) == 1
mlp(tf.ones((10, 10)))
assert len(mlp.losses) == 1  # No accumulation.
```

When using `model.fit()`, such loss terms are handled automatically.

When writing a custom training loop, you should retrieve these terms by hand from `model.losses`, like this:

```python
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# Iterate over the batches of a dataset.
for x, y in dataset:
    with tf.GradientTape() as tape:
        # Forward pass.
        logits = model(x)
        # Loss value for this batch.
        loss_value = loss_fn(y, logits)
        # Add extra loss terms to the loss value.
        loss_value += sum(model.losses)

    # Update the weights of the model to minimize the loss value.
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
```

See the `add_loss()` documentation for more details.
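Finally, as noted above, `fit()` folds these extra terms in with no additional code. Here is a rough end-to-end sketch (the layer sizes and the random training data are placeholders), reusing the `MyActivityRegularizer` layer defined earlier:

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(10,))
x = layers.Dense(32, activation="relu")(inputs)
x = MyActivityRegularizer(1e-2)(x)  # contributes its term via add_loss()
outputs = layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)

# The regularization term created in call() is combined with the
# compiled "mse" loss automatically during fit().
model.compile(optimizer="adam", loss="mse")
model.fit(tf.random.normal((64, 10)), tf.random.normal((64, 1)), epochs=1)
```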


