




Multi-style Generative Network for Real-time Transfer
Hang Zhang    Kristin Dana
Department of Electrical and Computer Engineering, Rutgers University
zhang.hang@, kdana@
Abstract
Recent work in style transfer learns a feed-forward generative network to approximate the prior optimization-based approaches, resulting in real-time performance. However, these methods require training separate networks for different target styles, which greatly limits their scalability. We introduce a Multi-style Generative Network (MSG-Net) with a novel Inspiration Layer, which retains the functionality of optimization-based approaches and has the fast speed of feed-forward networks. The proposed Inspiration Layer explicitly matches the feature statistics with the target styles at runtime, which dramatically improves the versatility of existing generative networks, so that multiple styles can be realized within one network. The proposed MSG-Net matches image styles at multiple scales and puts the computational burden into training. The learned generator is a compact feed-forward network that runs in real time after training. Compared to previous work, the proposed network achieves fast style transfer with at least comparable quality using a single network. The experimental results cover (but are not limited to) simultaneous training of twenty different styles in a single network. The complete software system and pre-trained models will be publicly available upon publication¹.
1. Introduction
Style transfer can be approached as reconstructing or synthesizing texture based on the target image's semantic content [24]. Many pioneering works have achieved success in classic texture synthesis, starting with methods that resample pixels [8, 9, 23, 36] or match multiscale feature statistics [7, 16, 29]. These methods employ traditional image pyramids obtained by handcrafted multiscale linear filter banks [1, 32] and perform texture synthesis by matching the feature statistics to the target style at multiple scales.

¹/zhanghang1989/MSG-Net
Figure 1: The problem of multi-style transfer in real time using a single network is solved in this paper. Examples of transferred images and the corresponding styles.
In recent years, the concepts of texture synthesis and style transfer have been revisited within the context of deep learning. Feature histograms at pyramid levels in traditional methods are replaced with Gram Matrix representations from convolutional neural networks (CNNs). Gatys et al. [11, 12] first adopted a pre-trained CNN as a descriptive representation of image statistics and provided an explicit representation that separates image style and content information. This framework has achieved great success in both texture synthesis and style transfer. The method is optimization-based because the new texture image is generated by applying gradient descent that manipulates a white noise image to match the Gram Matrix representation of the target image. Optimization-based approaches have no scalability problem but require expensive computation to generate images using gradient descent.
Recent work [21, 34] trains a feed-forward generative network to approximate the optimization process, transforming the image into a target style in real time by moving the computational burden into the training process. However, these approaches require training separate networks for each different style, which severely limits the scalability. Chen et al. [3] adopt a hybrid solution by separating mid-level convolutional filters individually for each style, like a hard switch, and sharing the down-sampling and up-sampling parts across different styles. Nevertheless, the size of the network is still proportional to the number of styles, which will be problematic for hundreds of styles.

Figure 2: An overview of MSG-Net, the Multi-style Generative Network. The transformation network, as part of the generator, explicitly matches the feature statistics captured by a descriptive network of the style targets using the proposed Inspiration Layer, denoted as Ins (introduced in Section 3). The detailed architecture of the transformation network is shown in Table 1. A pre-trained loss network provides the supervision of MSG-Net learning by minimizing the content and style differences with the targets, as discussed in Section 4.2.
What limits the diversity of styles in an existing generative network? Existing work learns a generative network taking an input image x_c and providing the transferred output G(x_c), in which the feature statistics of the style image are implicitly learned from the loss function without informing the network about the style target x_s [21, 25, 34]. For existing approaches, there is a fundamental difficulty in deriving a representation that simultaneously preserves the semantic content of the input image x_c and matches the style of the target image x_s (see the extended discussion in Section 4.3). In order to build a multi-style generative network G(x_c, x_s), where x_s can be chosen from a diverse set of styles, the generator network should explicitly match the feature statistics of the style target images at runtime.
As the first contribution of this paper, we introduce an Inspiration Layer which matches the feature statistics (Gram Matrix) of the target and preserves the input semantic content at runtime. The Inspiration Layer is end-to-end learnable with existing generative network architectures and puts the computational burden into the training process to achieve real-time style matching. The proposed Inspiration Layer enables multi-style generation from a single network, which dramatically improves the versatility of existing generative network architectures.

The second contribution of this paper is learning a novel feed-forward generative network for real-time multi-style matching, which we refer to as the Multi-style Generative Network (MSG-Net). The Inspiration Layer is a component of MSG-Net. The proposed network explicitly matches the feature statistics at multiple scales in order to retain the performance of optimization-based methods and achieve the speed of feed-forward networks. The network design benefits from a recent advance in CNN architecture, the residual block [14], which reduces computational complexity without losing style versatility by preserving a larger number of channels. We further design an up-sampling residual block to allow passing identity all the way through the generative network, enabling the network to extend deeper and converge faster. The experimental results show that MSG-Net can achieve real-time style transfer with image fidelity comparable to previous work.
The paper is organized as follows. We briefly describe the prior work on content and style representation using a CNN framework in Section 2. We introduce our proposed Inspiration Layer in Section 3 and our novel generative architecture, the Multi-style Generative Network, in Section 4. The comparison to other approaches is discussed in Section 4.3. Finally, the experimental results and comparisons are presented in Section 5.
2. Content and Style Representation
CNNs pre-trained on a very large dataset such as ImageNet can be regarded as descriptive representations of image statistics containing both semantic content and style information. Gatys et al. [12] provide explicit representations that independently model the image content and style from CNNs, which we briefly describe in this section for completeness.

Figure 3: Visualizing the effects of multi-scale feature statistics for style transfer (top) and texture synthesis (bottom). (First column) Input targets; (center columns) inverted representations at each individual scale; (last column) inverted representation combining the multiple scales. We use a pre-trained 16-layer VGG as the descriptive network, consider the size of 256×256 as the original scale, and reduce the size by dividing by 2^{i-1} for the i-th scale. The feature maps after ReLU at each scale are used. Panels: (a) targets, (b) scale 1, (c) scale 2, (d) scale 3, (e) all scales.
The semantic content of the image can be represented as the activations of the descriptive network at the i-th scale, F_i(x) ∈ R^{C_i×H_i×W_i} for a given input image x, where C_i, H_i and W_i are the number of feature map channels, the feature map height, and the feature map width. The texture or style of the image can be represented as the distribution of the features using the Gram Matrix G(F_i(x)) ∈ R^{C_i×C_i}, given by

    G(F_i(x)) = \sum_{h=1}^{H_i} \sum_{w=1}^{W_i} F_{i,h,w}(x) \, F_{i,h,w}(x)^T.    (1)

The Gram Matrix is orderless and describes the feature distributions. For zero-centered data, the Gram Matrix is the same as the covariance matrix scaled by the number of elements C_i×H_i×W_i. It can be calculated efficiently by first reshaping the feature map as Φ(F_i(x)) ∈ R^{C_i×(H_iW_i)}, where Φ() is a reshaping operation. Then the Gram Matrix can be written as G(F_i(x)) = Φ(F_i(x)) Φ(F_i(x))^T.
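The reshaping identity can be checked numerically. The following NumPy sketch (our own illustration, with `feat` standing in for the activation F_i(x)) computes Equation 1 both by the double sum and by the reshaping trick:

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a C x H x W feature map via the reshaping trick:
    G = Phi(F) Phi(F)^T, where Phi flattens spatial dims to C x (H*W)."""
    c, h, w = feat.shape
    phi = feat.reshape(c, h * w)   # Phi(F_i(x)) in R^{C x (HW)}
    return phi @ phi.T             # C x C, orderless feature statistics

# Sanity check against the double-sum definition of Equation 1.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 5, 6))
g_sum = sum(np.outer(feat[:, h, w], feat[:, h, w])
            for h in range(5) for w in range(6))
assert np.allclose(gram_matrix(feat), g_sum)
```

The resulting matrix is symmetric and independent of the spatial arrangement of the features, which is exactly what makes it a style (rather than content) descriptor.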
3. Inspiration Layer
In this section, we introduce the Inspiration Layer, which explicitly matches multi-scale feature statistics based on the given styles. For a given content target x_c and a style target x_s, the content and style representations at the i-th scale using the descriptive network can be written as F_i(x_c) and G(F_i(x_s)), respectively. A direct solution Ŷ_i is desirable which preserves the semantic content of the input image and matches the target style feature statistics:

    \hat{Y}_i = \arg\min_{Y_i} \left\{ \| Y_i - F_i(x_c) \|_F^2 + \alpha \| G(Y_i) - G(F_i(x_s)) \|_F^2 \right\},    (2)

where α is a trade-off parameter balancing the contributions of the content and style targets.

The minimization of the above problem is solvable using an iterative approach, but it is infeasible to achieve in real time or to make the model differentiable. However, we can still approximate the solution and put the computational burden into the training stage. We introduce an approximation which tunes the feature map based on the target style:

    \hat{Y}_i = \Phi^{-1} \left[ \Phi(F_i(x_c))^T \, W \, G(F_i(x_s)) \right]^T,    (3)

where W ∈ R^{C_i×C_i} is a learnable weight matrix and Φ() is a reshaping operation to match the dimensions, so that Φ(F_i(x_c)) ∈ R^{C_i×(H_iW_i)}. For intuition on the functionality of W, suppose W = G(F_i(x_s))^{-1}; then the first term in Equation 2 is minimized. Now let W = Φ(F_i(x_c))^{-T} (L(F_i(x_s)))^{-1}, where L(F_i(x_s)) is obtained by the Cholesky decomposition of G(F_i(x_s)) = L(F_i(x_s)) L(F_i(x_s))^T; then the second term of Equation 2 is minimized. We let W be learned directly from the loss function to dynamically balance the trade-off.

End-to-end Learning. The Inspiration Layer is differentiable with respect to both the layer input and the layer weights. Therefore, the Inspiration Layer can be learned by a standard Stochastic Gradient Descent (SGD) solver. We provide explicit expressions for the derived backpropagation equations in the supplementary material.
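To make Equation 3 concrete, here is a minimal NumPy sketch of the forward pass (our own illustration, not the released implementation); it also exercises the intuition above that choosing W = G(F_i(x_s))^{-1} returns the content features unchanged:

```python
import numpy as np

def inspiration_forward(feat_c, gram_s, W):
    """Inspiration Layer forward pass of Equation 3:
    Y = Phi^{-1}[ Phi(F_i(x_c))^T W G(F_i(x_s)) ]^T."""
    c, h, w = feat_c.shape
    phi_c = feat_c.reshape(c, h * w)   # Phi(F_i(x_c)), C x (HW)
    y = (phi_c.T @ W @ gram_s).T       # back to C x (HW)
    return y.reshape(c, h, w)          # Phi^{-1}: reshape to C x H x W

rng = np.random.default_rng(0)
feat_c = rng.standard_normal((4, 8, 8))
feat_s = rng.standard_normal((4, 8, 8))
phi_s = feat_s.reshape(4, -1)
gram_s = phi_s @ phi_s.T

# With W = G(F_i(x_s))^{-1}, the layer returns the content features
# unchanged, so the first term of Equation 2 is minimized.
W = np.linalg.inv(gram_s)
out = inspiration_forward(feat_c, gram_s, W)
assert np.allclose(out, feat_c)
```

In the trained network, W is instead a learned parameter, so the layer interpolates between preserving content features and imposing the target's Gram statistics.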
Figure 4: We extend the original down-sampling residual architecture (left) to an up-sampling version (right). We use a 1×1 fractionally-strided convolution as a shortcut and adopt reflectance padding.
4. Multi-style Generative Network
4.1. Network Architecture
Existing feed-forward style transfer work learns a generator network that takes only the content image as the input and outputs the transferred image, i.e., the generator network can be expressed as G(x_c), which implicitly learns the feature statistics from the loss function. We introduce a Multi-style Generative Network which takes both content and style targets as inputs, i.e., G(x_c, x_s). The proposed network explicitly matches the feature statistics of the style targets at multiple scales.

As part of the generator network, we adopt a 16-layer pre-trained VGG network [33] as a descriptive network F, which captures the feature statistics of the style image x_s at different scales and outputs the Gram Matrices {G(F_i(x_s))} (i = 1, ..., K), where K is the total number of scales. Then a transformation network takes the content image x_c and matches the feature statistics of the style image at multiple scales with Inspiration Layers. We adopt the VGG network pre-trained on ImageNet [31] as the descriptive network, because the network features learned from a diverse set of images are likely to be generic and informative.
Multi-scale Processing. Figure 3 illustrates the impact of the multi-scale representation by comparing the reconstruction results from the feature statistics at individual scales with the result of combining multiple scales. We use a pre-trained 16-layer VGG as a descriptive network and use Gatys' approach to invert the representation [11]. This experiment suggests that feature statistics at individual scales are not informative enough to reconstruct textures or styles. Multi-scale feature statistics provide a comprehensive representation of the textures and styles. Therefore, we introduce a multi-scale Inspiration architecture to match the feature statistics at four different scales, as shown in Figure 2.
layer name    | output size | MSG-Net
Conv1         | 256×256     | 7×7, 64
Inspiration1  | 256×256     | C=64
Res1          | 128×128     | [1×1, 32; 3×3, 32; 1×1, 128] ×k
Inspiration2  | 128×128     | C=128
Res2          | 64×64       | [1×1, 64; 3×3, 64; 1×1, 256] ×k
Inspiration3  | 64×64       | C=256
Res3          | 32×32       | [1×1, 128; 3×3, 128; 1×1, 512] ×k
Inspiration4  | 32×32       | C=512
Up-Res1       | 64×64       | [1×1, 64; 3×3, 64; 1×1, 256] ×k
Up-Res2       | 128×128     | [1×1, 32; 3×3, 32; 1×1, 128] ×k
Up-Res3       | 256×256     | [1×1, 16; 3×3, 16; 1×1, 64] ×k
Conv2         | 256×256     | 7×7, 3

Table 1: The architecture of the transformation network with (18k+6) layers, which is the core part of the Multi-style Generative Network (MSG-Net).
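The (18k+6) layer count of Table 1 follows from a simple tally: 2 outer convolutions, 4 Inspiration Layers, and 6 residual stages (3 down, 3 up) of k bottleneck blocks with 3 convolutions each. The following snippet verifies the arithmetic:

```python
def msg_net_depth(k):
    """Layer count of the transformation network in Table 1."""
    convs = 2              # Conv1 and Conv2 (7x7)
    inspirations = 4       # Inspiration1..4
    residual_stages = 6    # Res1-3 and Up-Res1-3
    convs_per_block = 3    # 1x1, 3x3, 1x1 bottleneck
    return convs + inspirations + residual_stages * k * convs_per_block

assert msg_net_depth(1) == 24   # 18k + 6 with k = 1
assert msg_net_depth(2) == 42   # the 42-layer model used in Section 5
```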
Up-sample Residual Block. Deep residual learning has achieved great success in visual recognition [14, 15]. The residual block architecture plays an important role by reducing the computational complexity without losing diversity, by preserving the large number of feature map channels. We extend the original architecture with an up-sampling version, as shown in Figure 4 (right), which has a fractionally-strided convolution [27] as the shortcut and adopts reflectance padding to avoid artifacts in the generative process. This up-sampling residual architecture allows us to pass identity all the way through the network (as shown in Table 1), so that the network converges faster and extends deeper.
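As an aside on the shortcut used above, a fractionally-strided (transposed) convolution can be viewed as zero-insertion upsampling followed by an ordinary convolution. A simplified 1-D NumPy sketch (our own illustration, not the network code):

```python
import numpy as np

def frac_strided_conv1d(x, kernel):
    """Stride-2 fractionally-strided convolution in 1-D:
    insert a zero between input samples, then convolve ('same' size)."""
    up = np.zeros(2 * len(x))
    up[::2] = x                               # zero-insertion upsampling
    return np.convolve(up, kernel, mode="same")

x = np.array([1.0, 2.0, 3.0])
out = frac_strided_conv1d(x, np.array([0.5, 1.0, 0.5]))
assert out.shape == (6,)                      # spatial size doubled
```

Because the kernel weights are learned, the network can learn an upsampling filter better suited to the generative task than fixed interpolation.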
Other Details. We only use in-network down-sampling (convolutional) and up-sampling (fractionally-strided convolution) in the transformation network, as in previous work [21, 30]. We use reflectance padding to avoid artifacts at the border. Instance normalization [35] and ReLU are used after the weight layers (convolution, fractionally-strided convolution and the Inspiration Layer), which improves the generated image quality and is robust to image contrast changes.
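Instance normalization itself is a simple per-channel, per-sample operation. A NumPy sketch of the core computation (our illustration, omitting the learnable affine parameters of [35]):

```python
import numpy as np

def instance_norm(feat, eps=1e-5):
    """Normalize each channel of a C x H x W map over its spatial extent."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    var = feat.var(axis=(1, 2), keepdims=True)
    return (feat - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16)) * 3.0 + 2.0
y = instance_norm(x)
# Each channel now has (approximately) zero mean and unit variance,
# which is what makes the generator robust to image contrast changes.
assert np.allclose(y.mean(axis=(1, 2)), 0.0, atol=1e-6)
assert np.allclose(y.std(axis=(1, 2)), 1.0, atol=1e-3)
```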
4.2. Network Learning
Style transfer is an open problem, since there is no gold-standard ground truth to follow. We follow previous work in minimizing a weighted combination of the style and content differences between the generator network outputs and the targets for a given pre-trained loss network F [21, 34]. Let the generator network be denoted by G(x_c, x_s), parameterized by weights W_G. Learning proceeds by sampling content images x_c ~ X_c and style images x_s ~ X_s and then adjusting the parameters W_G of the generator G(x_c, x_s) in order to minimize the loss:

    \hat{W}_G = \arg\min_{W_G} E_{x_c, x_s} \Big\{ \lambda_c \| F_c(G(x_c, x_s)) - F_c(x_c) \|_F^2
        + \lambda_s \sum_{i=1}^{K} \| G(F_i(G(x_c, x_s))) - G(F_i(x_s)) \|_F^2
        + \lambda_{TV} \, \ell_{TV}(G(x_c, x_s)) \Big\},    (4)

where λ_c and λ_s are the balancing weights for the content and style losses. We consider the image content at scale c and the image style at scales i ∈ {1, ..., K}. ℓ_TV() is the total variation regularization, as used in prior work for encouraging the smoothness of the generated images [21, 28, 38].
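The loss of Equation 4 can be sketched for a single sample as follows (a simplified NumPy illustration; `descr` is a stand-in for the pre-trained loss network F, returning feature maps at each scale, and the toy two-scale "network" below is ours, not VGG):

```python
import numpy as np

def gram(feat):
    c = feat.shape[0]
    phi = feat.reshape(c, -1)
    return phi @ phi.T

def tv_loss(img):
    """Total variation regularization: penalize neighbor differences."""
    return (np.abs(np.diff(img, axis=1)).sum()
            + np.abs(np.diff(img, axis=2)).sum())

def transfer_loss(descr, out, xc, xs, lam_c=1.0, lam_s=5.0, lam_tv=1e-6):
    """Weighted content + multi-scale style + TV loss of Equation 4."""
    f_out, f_c, f_s = descr(out), descr(xc), descr(xs)
    content = np.sum((f_out[-1] - f_c[-1]) ** 2)   # content at one scale c
    style = sum(np.sum((gram(a) - gram(b)) ** 2)
                for a, b in zip(f_out, f_s))       # styles at scales 1..K
    return lam_c * content + lam_s * style + lam_tv * tv_loss(out)

# Toy "descriptive network": two scales via average pooling.
descr = lambda img: [img, img.reshape(3, 4, 2, 4, 2).mean(axis=(2, 4))]
rng = np.random.default_rng(0)
xc, xs = rng.standard_normal((2, 3, 8, 8))
assert transfer_loss(descr, xc, xc, xc, lam_tv=0.0) == 0.0  # perfect match
assert transfer_loss(descr, xc, xc, xs) > 0.0               # style mismatch
```

In training, this scalar is averaged over the mini-batch and backpropagated through the generator, including the Inspiration Layers.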
4.3. Relation to Other Methods
Relation to Pyramid Matching. Early methods for texture synthesis were developed using multi-scale image pyramids [7, 16, 29, 36]. The discovery in these earlier methods was that realistic texture images could be synthesized by manipulating a white noise image so that its feature statistics matched those of the target at each pyramid level. Our approach is inspired by these classic methods, matching feature statistics at multiple image scales, but it leverages the advantages of deep learning networks while placing the computational costs into the training process.

Relation to Fusion Layers. Our proposed Inspiration Layer is a kind of fusion layer that takes two inputs (content and style representations). Current work on fusion layers with CNNs includes feature map concatenation and element-wise sum [10, 19, 37]. However, these approaches are not directly applicable, since there is no separation of style from content. For style transfer, the generated images should carry neither the semantic information of the style target nor the style of the content image. In addition, the input representation sizes must match in prior fusion methods; but for style transfer, the content representation has the dimension C_i×H_i×W_i while the orderless style representation (Gram Matrix) has the dimension C_i×C_i.

                            | Storage | Training Time | Test Time
Optimization based [12, 24] | O(1)    | N/A           | slow
Feed-forward [3, 21, 34]    | O(N)    | O(N)          | real-time
MSG-Net (ours)              | O(1)    | O(N)          | real-time

Table 2: Compared to existing methods, MSG-Net has the benefit of the real-time style transfer of feed-forward approaches as well as the scalability of classic approaches.
Relation to Generative Networks and Adversarial Training. The Generative Adversarial Network (GAN) [13], which jointly trains an adversarial generator and discriminator, has catalyzed a surge of interest in the study of image generation [2, 18, 19, 30, 37]. Recent work on image-to-image GANs [19] adopts a conditional GAN to provide a general solution for some image-to-image generation problems for which it was previously hard to define a loss function. However, the style transfer problem cannot be tackled using the conditional GAN framework, due to the missing ground-truth image pairs. Instead, we follow the work [21, 34] in adopting a discriminator/loss network that minimizes the perceptual difference of the synthesized images with the content and style targets and provides the supervision for learning the generative network. The initial idea of employing the Gram Matrix to trigger the style synthesis is inspired by a recent work [2] that suggests using an encoder instead of a random vector in the GAN framework.

Concurrent with our work. Concurrent work [4, 17] explores arbitrary style transfer. A style-swap layer is proposed in [4], but it has lower quality and slower speed (compared to existing feed-forward approaches). An adaptive instance normalization is introduced in [17] to match the mean and variance of the feature maps with the style target. Instead, our Inspiration Layer matches the second-order statistics of the Gram Matrices of the feature maps at multiple scales. We also explore applying our method to new styles (not seen during training) in Figure 8.
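For contrast, adaptive instance normalization [17] matches only the channel-wise mean and variance of the features rather than the full Gram statistics. A NumPy sketch of that operation (our illustration, for comparison only):

```python
import numpy as np

def adain(feat_c, feat_s, eps=1e-5):
    """Adaptive instance normalization (as in [17], sketched here):
    re-scale content features to the style's channel-wise mean/std."""
    mu_c = feat_c.mean(axis=(1, 2), keepdims=True)
    sd_c = feat_c.std(axis=(1, 2), keepdims=True)
    mu_s = feat_s.mean(axis=(1, 2), keepdims=True)
    sd_s = feat_s.std(axis=(1, 2), keepdims=True)
    return (feat_c - mu_c) / (sd_c + eps) * sd_s + mu_s

rng = np.random.default_rng(0)
fc, fs = rng.standard_normal((2, 4, 8, 8))
out = adain(fc, fs)
# The output's channel means now match the style's channel means.
assert np.allclose(out.mean(axis=(1, 2)), fs.mean(axis=(1, 2)), atol=1e-4)
```

Matching only these first- and second-moment scalars discards the cross-channel correlations that a Gram Matrix captures, which is the distinction drawn above.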
5. Experimental Results
In this section, we make a qualitative comparison of the proposed MSG-Net with existing approaches for the style transfer task. We consider the gold-standard optimization-based work of Gatys et al. [12] and the state-of-the-art feed-forward approach of Johnson et al. [21] with instance normalization [35]. Additionally, we show that MSG-Net can be applied to the texture synthesis task.
(a) input (b) Gatys [12] (c) Johnson [21] (d) MSG-Net (ours)
Figure 5: Comparison to existing approaches. Our proposed MSG-Net dramatically improves the scalability of the generative network and achieves results comparable to existing work for each individual style.
5.1. Style Transfer
Baselines. We adopt a publicly available implementation [20] of Gatys et al. [12] as the gold-standard baseline for optimization-based approaches. We are given the content image x_c, the style image x_s, and a pre-trained 16-layer VGG [33] as a descriptive network F. Considering the content reconstruction at the c-th scale and the style reconstruction at scales i ∈ {1, ..., K}, an image ŷ is initialized with white noise and updated by iteratively minimizing the objective function:

    \hat{y} = \arg\min_y \Big\{ \lambda_c \| F_c(y) - F_c(x_c) \|_F^2
        + \lambda_s \sum_{i=1}^{K} \| G(F_i(y)) - G(F_i(x_s)) \|_F^2
        + \lambda_{TV} \, \ell_{TV}(y) \Big\}.    (5)

The optimization is performed using an L-BFGS solver for 500 iterations. The method is slow due to forwarding and backwarding the image y at each iteration.

We also compare our approach with an improved version of recent feed-forward work [21] using instance normalization [35] as the state-of-the-art approach, where we train each feed-forward network individually for the different style images using the same loss function as Equation 4.
Method Details. The 16-layer VGG network [33] is used as the descriptive network within MSG-Net and as the loss network in Equation 4. For both networks, we consider the style representation at 4 different scales using the layers ReLU1_2, ReLU2_2, ReLU3_3 and ReLU4_3. For the loss network, we consider the content representation at the layer ReLU2_2. The Microsoft COCO dataset [26] is used as the content image set X_c, which has around 80,000 natural images. We collect 20 style images, choosing those that are typically used in previous work. A 42-layer MSG-Net is used in this experiment (k=2, as shown in Table 1). We follow the work [21, 34] and adopt Adam [22] to train the network with a learning rate of 1×10^{-3}. For learning the network, we use the loss function described in Equation 4 with the balancing weights λ_c=1, λ_s=5, λ_TV=1×10^{-6} for content, style and total variation regularization. We resize the content images x_c ~ X_c to 256×256 and learn the network with a batch size of 4 for 4,000×N_style iterations. We iteratively update the style image x_s every 20 iterations² with a size of 512×512³. After training, MSG-Net can accept an arbitrary input image size; in this experiment we resize the input images to 512 along the long edge before feeding them into the network during testing. Our implementation is based on Torch [6]; training the 20-style MSG-Net model takes roughly 8 hours on a single Titan X Pascal GPU, which is 10 times faster than Johnson's approach for the same number of styles, because the different styles benefit from the joint optimization.
Qualitative Comparison. We keep the same hyperparameters, such as the balancing weights, for MSG-Net and the baselines. For the optimization-based approach of Gatys [12], we stop the optimization after 500 iterations, which is typically more than enough. The feed-forward baseline model using Johnson's approach [21] is learned for 40,000 iterations for each style model, the same as in the original paper. We train the proposed 20-style MSG-Net for 80,000 iterations (4,000×N_style). A standard histogram matching is applied as a post-processing step to all the approaches, which adds a slight improvement to the perceived color. Figure 5 shows the comparison of the three approaches using popular pictures of Lena and the city of Atlanta with two popular styles. We can see that the optimization-based approach is more colorful and has sharper textures than the feed-forward approaches (Johnson's and MSG-Net), for example in the buildings and the roads in the pictures of Atlanta (bottom row). Feed-forward approaches have the advantage of preserving semantic consistencies, such as the human face and hair in the picture of Lena (top row), because the models are trained on the MS-COCO dataset, which contains a lot of context information from real-world images. More examples of the transferred images using MSG-Net are shown in Figure 6. In general, our proposed MSG-Net dramatically improves the scalability of the network for style transfer and has at least comparable quality compared to existing work.

² The number 20 was chosen by a grid search varying from 4 to 20, taking the value with the best quality.
³ We use a large style image, which provides more texture details and improves the quality compared to the size of 256×256.

Figure 6: Diverse images generated using a single MSG-Net. The first row shows the input content images and the other rows are generated images with different style targets (first column).

Figure 7: Texture synthesis examples generated using a single MSG-Net and the corresponding texture targets.

Figure 8: Testing a model with style targets that have NOT been covered during the training.

5.2. Texture Synthesis
Texture synthesis can be regarded as a special case of style transfer, in which the content image is not involved and the goal is to reconstruct the textures. In this section, we explore how our approach can be applied to the texture synthesis task. 10 textures are selected from the Describable Texture Dataset (DTD) [5] as the targets. We adopt the same network and training strategy as in the style transfer task. We follow previous work [25, 34] and feed MSG-Net with random noise to trigger the texture synthesis. Li et al. [25] suggest that using Brown noise, which contains a spectrum of frequencies, produces textures with better quality than using white noise. We further discover that triggering the network with a structured noise containing both different frequencies and various intensities results in textures with even better quality. Examples of texture synthesis are shown in Figure 7 and the right column of Figure 6.
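The noise triggers discussed for texture synthesis can be illustrated as follows (our own construction, not the paper's exact recipe): Brown noise is obtained by cumulatively summing white noise, which concentrates energy at low frequencies (roughly a 1/f² spectrum):

```python
import numpy as np

def brown_noise_image(h, w, seed=0):
    """2-D Brown(ish) noise trigger: cumulative sums of white noise
    along both axes concentrate energy at low spatial frequencies."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((h, w))
    brown = np.cumsum(np.cumsum(white, axis=0), axis=1)
    brown -= brown.mean()
    return brown / brown.std()   # zero mean, unit variance

img = brown_noise_image(64, 64)
assert img.shape == (64, 64)
assert abs(img.mean()) < 1e-6 and abs(img.std() - 1.0) < 1e-6
```

A "structured" variant in the spirit of the text could further modulate such noise with spatially varying intensities, but the exact construction used in the experiments is not specified here.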
6. Conclusion and Discussion
The problem of multi-style transfer in real time using a single network has been addressed in this paper. We tackle the technical difficulties of existing approaches by introducing a novel Inspiration Layer, which explicitly matches the target styles at runtime. We have demonstrated that the Inspiration Layer embedded in our proposed Multi-style Generative Network enables 20-style transfer without losing quality. The proposed MSG-Net puts the computational burden of matching feature statistics into the training process, which enables real-time transfer. It runs at 17.8 frames/sec for an input image of size 256×256 on a single Titan X Pascal GPU.

However, dealing with unknown styles is still an unsolved problem for feed-forward approaches. The strategy of putting the burden into training limits the performance on unknown style images, as shown in Figure 8. This can potentially be solved by a large variety of training styles and a better style representation model, so that the interpolation between different styles can be learned by the generator network.

Acknowledgment
This work was supported by National Science Foundation award IIS-1421134. A GPU used for this research was donated by the NVIDIA Corporation.
References
[1] P. Burt and E. Adelson. The Laplacian pyramid as a compact image code. IEEE Transactions on Communications, 31(4):532–540, 1983.
[2] T. Che, Y. Li, A. P. Jacob, Y. Be