Losses

The purpose of loss functions is to compute the quantity that a model should seek to minimize during training.
Available losses
Note that all losses are available both via a class handle and via a function handle. The class handles enable you to pass configuration arguments to the constructor (e.g. loss_fn = CategoricalCrossentropy(from_logits=True)), and they perform reduction by default when used in a standalone way (see details below).
Probabilistic losses
BinaryCrossentropy class
CategoricalCrossentropy class
SparseCategoricalCrossentropy class
Poisson class
binary_crossentropy function
categorical_crossentropy function
sparse_categorical_crossentropy function
poisson function
KLDivergence class
kl_divergence function
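For instance, a minimal standalone sketch of two of these losses (the label and probability values below are purely illustrative):

import tensorflow as tf

# Binary crossentropy: binary labels vs. predicted probabilities.
bce = tf.keras.losses.BinaryCrossentropy()
print(bce([0., 1., 1.], [0.1, 0.8, 0.9]).numpy())

# Categorical crossentropy: one-hot labels vs. per-class probabilities.
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce([[0., 1., 0.], [1., 0., 0.]],
          [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]).numpy())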
Regression losses
MeanSquaredError class
MeanAbsoluteError class
MeanAbsolutePercentageError class
MeanSquaredLogarithmicError class
CosineSimilarity class
mean_squared_error function
mean_absolute_error function
mean_absolute_percentage_error function
mean_squared_logarithmic_error function
cosine_similarity function
Huber class
huber function
LogCosh class
log_cosh function
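As a quick illustration (a sketch with arbitrary values), the Huber loss penalizes small errors quadratically and large errors linearly, controlled by its delta argument:

import tensorflow as tf

huber = tf.keras.losses.Huber(delta=1.0)
y_true = tf.constant([[0.0], [10.0]])
y_pred = tf.constant([[0.5], [0.0]])
# The first sample has a small error (quadratic regime),
# the second a large one (linear regime).
print(huber(y_true, y_pred).numpy())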
Hinge losses for "maximum-margin" classification
Hinge class
SquaredHinge class
CategoricalHinge class
hinge function
squared_hinge function
categorical_hinge function
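Hinge losses expect y_true values of -1 or 1 (binary 0/1 labels are converted internally) and operate on raw margin scores; a minimal sketch with made-up values:

import tensorflow as tf

hinge = tf.keras.losses.Hinge()
y_true = tf.constant([[-1.0, 1.0], [1.0, -1.0]])
y_pred = tf.constant([[-0.8, 0.7], [0.3, -1.2]])  # raw scores, no activation
print(hinge(y_true, y_pred).numpy())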
Usage of losses with compile() & fit()
A loss function is one of the two arguments required for compiling a Keras model:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))

loss_fn = keras.losses.SparseCategoricalCrossentropy()
model.compile(loss=loss_fn, optimizer='adam')
All built-in loss functions may also be passed via their string identifier:
# pass loss by name: default parameters will be used
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
Loss functions are typically created by instantiating a loss class (e.g. keras.losses.SparseCategoricalCrossentropy). All losses are also provided as function handles (e.g. keras.losses.sparse_categorical_crossentropy).

Using classes enables you to pass configuration arguments at instantiation time, e.g.:

loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
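The from_logits=True variant is typically paired with a model whose final layer has no softmax, so the loss applies the softmax internally in a numerically stable way. A minimal sketch under that assumption:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(10,)),
    layers.Dense(10),  # raw logits: no softmax activation here
])
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
)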
Standalone usage of losses
A loss is a callable with arguments loss_fn(y_true, y_pred, sample_weight=None):

y_true: Ground truth values, of shape (batch_size, d0, ... dN). For sparse loss functions, such as sparse categorical crossentropy, the shape should be (batch_size, d0, ... dN-1).
y_pred: The predicted values, of shape (batch_size, d0, ... dN).
sample_weight: Optional sample_weight acts as a reduction weighting coefficient for the per-sample losses. If a scalar is provided, then the loss is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], then the total loss for each sample of the batch is rescaled by the corresponding element in the sample_weight vector. If the shape of sample_weight is (batch_size, d0, ... dN-1) (or can be broadcast to this shape), then each loss element of y_pred is scaled by the corresponding value of sample_weight. (Note on dN-1: all loss functions reduce by 1 dimension, usually axis=-1.)
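For instance, a minimal sketch of passing a per-sample sample_weight in a standalone call (the weight values are arbitrary):

import tensorflow as tf

loss_fn = tf.keras.losses.MeanSquaredError()
y_true = tf.ones((4, 3))
y_pred = tf.zeros((4, 3))

# One weight per sample: each per-sample loss is rescaled by its weight
# before the default "sum_over_batch_size" reduction is applied.
sample_weight = tf.constant([1.0, 0.5, 0.0, 2.0])
print(loss_fn(y_true, y_pred, sample_weight=sample_weight).numpy())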
By default, loss functions return one scalar loss value per input sample, e.g.

>>> tf.keras.losses.mean_squared_error(tf.ones((2, 2,)), tf.zeros((2, 2)))
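Since each sample here has a squared error of 1.0 averaged over the last axis, this call should return one value per sample, i.e. a tensor containing [1., 1.].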
However, loss class instances feature a reduction constructor argument, which defaults to "sum_over_batch_size" (i.e. average). Allowable values are "sum_over_batch_size", "sum", and "none":

"sum_over_batch_size" means the loss instance will return the average of the per-sample losses in the batch.
"sum" means the loss instance will return the sum of the per-sample losses in the batch.
"none" means the loss instance will return the full array of per-sample losses.
>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='sum_over_batch_size')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))

>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='sum')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))

>>> loss_fn = tf.keras.losses.MeanSquaredError(reduction='none')
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
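Given these inputs (a per-sample loss of 1.0 for each of the two samples), the three calls should return 1.0 (the batch average), 2.0 (the batch sum), and the full per-sample array [1., 1.], respectively.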
Note that this is an important difference between loss functions like tf.keras.losses.mean_squared_error and default loss class instances like tf.keras.losses.MeanSquaredError: the function version does not perform reduction, but by default the class instance does.
>>> loss_fn = tf.keras.losses.mean_squared_error
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))

>>> loss_fn = tf.keras.losses.MeanSquaredError()
>>> loss_fn(tf.ones((2, 2,)), tf.zeros((2, 2)))
When using fit(), this difference is irrelevant since reduction is handled by the framework.
Here's how you would use a loss class instance as part of a simple training loop:
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# Iterate over the batches of a dataset.
for x, y in dataset:
    with tf.GradientTape() as tape:
        logits = model(x)
        # Compute the loss value for this batch.
        loss_value = loss_fn(y, logits)

    # Update the weights of the model to minimize the loss value.
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
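In practice you would often wrap the step in a tf.function-compiled train_step for speed; a minimal sketch, reusing the model, loss_fn, optimizer and dataset names from the loop above:

import tensorflow as tf

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss_value = loss_fn(y, logits)
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss_value

for x, y in dataset:
    loss_value = train_step(x, y)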
Creating custom losses
Any callable with the signature loss_fn(y_true, y_pred) that returns an array of losses (one per sample in the input batch) can be passed to compile() as a loss. Note that sample weighting is automatically supported for any such loss.
Here's a simple example:
def my_loss_fn(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)  # Note the `axis=-1`

model.compile(optimizer='adam', loss=my_loss_fn)
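If your custom loss needs configuration arguments (or should travel with a saved model), you can also subclass tf.keras.losses.Loss and implement call(y_true, y_pred); the class name and scale parameter below are illustrative:

import tensorflow as tf

class ScaledMSE(tf.keras.losses.Loss):
    """Mean squared error multiplied by a configurable factor (illustrative)."""

    def __init__(self, scale=1.0, name="scaled_mse"):
        super().__init__(name=name)
        self.scale = scale

    def call(self, y_true, y_pred):
        # Return one loss value per sample; the base class handles reduction.
        return self.scale * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

model.compile(optimizer='adam', loss=ScaledMSE(scale=0.5))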
The add_loss() API
Loss functions applied to the output of a model aren't the only way to create losses.

When writing the call method of a custom layer or a subclassed model, you may want to compute scalar quantities that you want to minimize during training (e.g. regularization losses). You can use the add_loss() layer method to keep track of such loss terms.
Here's an example of a layer that adds a sparsity regularization loss based on the L2 norm of the inputs:
from tensorflow.keras.layers import Layer

class MyActivityRegularizer(Layer):
    """Layer that creates an activity sparsity regularization loss."""

    def __init__(self, rate=1e-2):
        super(MyActivityRegularizer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # We use `add_loss` to create a regularization loss
        # that depends on the inputs.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs
Loss values added via add_loss can be retrieved in the .losses list property of any Layer or Model (they are recursively retrieved from every underlying layer):
from tensorflow.keras import layers

class SparseMLP(Layer):
    """Stack of Linear layers with a sparsity regularization loss."""

    def __init__(self, output_dim):
        super(SparseMLP, self).__init__()
        self.dense_1 = layers.Dense(32, activation=tf.nn.relu)
        self.regularization = MyActivityRegularizer(1e-2)
        self.dense_2 = layers.Dense(output_dim)

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.regularization(x)
        return self.dense_2(x)

mlp = SparseMLP(1)
y = mlp(tf.ones((10, 10)))

print(mlp.losses)  # List containing one float32 scalar
These losses are cleared by the top-level layer at the start of each forward pass -- they don't accumulate. So layer.losses always contains only the losses created during the last forward pass. You would typically use these losses by summing them before computing your gradients when writing a training loop.
# Losses correspond to the *last* forward pass.
mlp = SparseMLP(1)
mlp(tf.ones((10, 10)))
assert len(mlp.losses) == 1

mlp(tf.ones((10, 10)))
assert len(mlp.losses) == 1  # No accumulation.
When using model.fit(), such loss terms are handled automatically. When writing a custom training loop, you should retrieve these terms by hand from model.losses, like this:
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# Iterate over the batches of a dataset.
for x, y in dataset:
    with tf.GradientTape() as tape:
        # Forward pass.
        logits = model(x)
        # Loss value for this batch.
        loss_value = loss_fn(y, logits)
        # Add extra loss terms to the loss value.
        loss_value += sum(model.losses)

    # Update the weights of the model to minimize the loss value.
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
See the add_loss() documentation for more details.