
arXiv:2303.04226v1 [cs.AI] 7 Mar 2023

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

YIHAN CAO*, Lehigh University & Carnegie Mellon University, USA
SIYU LI, Lehigh University, USA
YIXIN LIU, Lehigh University, USA
ZHILING YAN, Lehigh University, USA
YUTONG DAI, Lehigh University, USA
PHILIP S. YU, University of Illinois at Chicago, USA
LICHAO SUN, Lehigh University, USA

Recently, ChatGPT, along with DALL-E-2 [1] and Codex [2], has been gaining significant attention from society. As a result, many individuals have become interested in related resources and are seeking to uncover the background and secrets behind its impressive performance. In fact, ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC), which involves the creation of digital content, such as images, music, and natural language, through AI models. The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace. AIGC is achieved by extracting and understanding intent information from instructions provided by humans, and generating content according to its knowledge and the intent information. In recent years, large-scale models have become increasingly important in AIGC as they provide better intent extraction and, thus, improved generation results. With the growth of data and model size, the distribution that the model can learn becomes more comprehensive and closer to reality, leading to more realistic and high-quality content generation. This survey provides a comprehensive review of the history and basic components of generative models, and of recent advances in AIGC from the perspectives of unimodal and multimodal interaction. From the perspective of unimodality, we introduce the generation tasks and related models for text and image. From the perspective of multimodality, we introduce the cross-modal applications between the modalities mentioned above. Finally, we discuss the existing open problems and future challenges in AIGC.

CCS Concepts: • Computer systems organization → Embedded systems; Redundancy; Robotics; • Networks → Network reliability.

Additional Key Words and Phrases: datasets, neural networks, gaze detection, text tagging

ACM Reference Format:
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, and Lichao Sun. 2018. A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. J. ACM 37, 4, Article 111 (August 2018), 44 pages. https://doi.org/XXXXXXX.XXXXXXX

*Incoming Ph.D. student at Lehigh University.

Authors' addresses: Yihan Cao, yihanc@, Lehigh University & Carnegie Mellon University, Pittsburgh, PA, USA; Siyu Li, applicantlisiyu@, Lehigh University, Bethlehem, PA, USA; Yixin Liu, lis221@, Lehigh University, Bethlehem, PA, USA; Zhiling Yan, zhilingyan724@, Lehigh University, Bethlehem, PA, USA; Yutong Dai, lis221@, Lehigh University, Bethlehem, PA, USA; Philip S. Yu, University of Illinois at Chicago, Chicago, Illinois, USA, psyu@; Lichao Sun, lis221@, Lehigh University, Bethlehem, PA, USA.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@.
© 2018 Association for Computing Machinery.
0004-5411/2018/8-ART111 $15.00
https://doi.org/XXXXXXX.XXXXXXX


1 INTRODUCTION

In recent years, Artificial Intelligence Generated Content (AIGC) has gained much attention beyond the computer science community, with society at large becoming interested in the various content generation products built by large tech companies [3], such as ChatGPT [4] and DALL-E-2 [5]. AIGC refers to content that is generated using advanced Generative AI (GAI) techniques, as opposed to being created by human authors, which can automate the creation of large amounts of content in a short amount of time. For example, ChatGPT is a language model developed by OpenAI for building conversational AI systems, which can efficiently understand and respond to human language inputs in a meaningful way. In addition, DALL-E-2 is another state-of-the-art GAI model, also developed by OpenAI, which is capable of creating unique and high-quality images from textual descriptions in a few minutes, such as "an astronaut riding a horse in a photorealistic style", as shown in Figure 1. Given these remarkable achievements in AIGC, many people believe it will usher in a new era of AI and make significant impacts on the whole world.

[Fig. 1. Examples of AIGC in image generation. Text instructions are given to the OpenAI DALL-E-2 model, and it generates two images according to the instructions. Instruction 1: "An astronaut riding a horse in a photorealistic style." Instruction 2: "Teddy bears working on new AI research on the moon in the 1980s."]

Technically, AIGC refers to the use of GAI algorithms to generate content that satisfies human instructions, which help teach and guide the model to complete the task. This generation process usually consists of two steps: extracting intent information from human instructions, and generating content according to the extracted intentions. However, the paradigm of GAI models containing these two steps is not entirely novel, as demonstrated by previous studies [6, 7]. The core advancements in recent AIGC compared to prior works are the result of training more sophisticated generative models on larger datasets, using larger foundation model architectures, and having access to extensive computational resources. For example, the main framework of GPT-3 remains the same as that of GPT-2, but the pre-training data size grows from WebText [8] (38GB) to CommonCrawl [9] (570GB after filtering), and the foundation model size grows from 1.5B to 175B parameters. Therefore, GPT-3 has better generalization ability than GPT-2 on various tasks, such as human intent extraction.

In addition to the benefits brought by the increase in data volume and computational power, researchers are also exploring ways to integrate new technologies with GAI algorithms. For example, ChatGPT utilizes reinforcement learning from human feedback (RLHF) [10-12] to determine the most appropriate response for a given instruction, thus improving the model's reliability and accuracy over time. This approach allows ChatGPT to better understand human preferences in long dialogues. Meanwhile, in computer vision, Stable Diffusion [13], proposed by Stability AI in 2022, has also shown great success in image generation. Unlike prior methods, generative diffusion models can help generate high-resolution images by controlling the trade-off between exploration and exploitation, resulting in a harmonious combination of diversity in the generated images and similarity to the training data.
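To ground this mechanism, the following is a minimal NumPy sketch of one DDPM-style reverse denoising step (not Stable Diffusion itself, which operates in a learned latent space); the noise-prediction network is stubbed out, and the linear beta schedule is a common default rather than anything specific to [13]. The freshly injected noise z is the source of sample diversity ("exploration"), while the learned mean pulls samples toward the training distribution ("exploitation").

```python
import numpy as np

rng = np.random.default_rng(0)

# Common linear beta schedule (a default choice, not specific to [13]).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predicted_noise(x_t, t):
    # Stub for the learned noise-prediction network eps_theta(x_t, t).
    return np.zeros_like(x_t)

def reverse_step(x_t, t):
    """One DDPM reverse step: subtract predicted noise, re-inject fresh noise."""
    eps = predicted_noise(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean                       # last step is deterministic
    z = rng.normal(size=x_t.shape)        # fresh noise: source of diversity
    return mean + np.sqrt(betas[t]) * z   # sigma_t^2 = beta_t variant

# Start from pure Gaussian noise and denoise step by step.
x = rng.normal(size=(8, 8))               # toy 8x8 "image"
for t in reversed(range(T)):
    x = reverse_step(x, t)
print(x.shape)
```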

[Fig. 2. Overview of AIGC. Generally, GAI models can be categorized into two types: unimodal models and multimodal models. Unimodal models receive instructions from the same modality as the generated content (e.g., the text instruction "Please write a story about a cat." yields the text "Once upon a time, there was a cat named Jessy..."), whereas multimodal models accept cross-modal instructions and produce results of different modalities (e.g., "Draw a picture of a cat.", "Describe this picture.", "Write a song about a cat.").]

By combining these advancements, models have made significant progress in AIGC tasks and have been adopted in various industries, including art [14], advertising [15], and education [16]. In the near future, AIGC will continue to be a significant area of research in machine learning. It is therefore crucial to conduct an extensive review of past research and identify the open problems in this field. This survey is the first one that focuses on the core technologies and applications in the field of AIGC.

1.1 Major Contributions

This is the first comprehensive survey of AIGC that summarizes GAI in the aspects of techniques and applications. Previous surveys have focused on GAI from various angles, including natural language generation [17], image generation [18], and generation in multimodal machine learning [7, 19]. However, these prior works only focus on a specific part of AIGC. In this survey, we first provide a review of foundation techniques commonly used in AIGC. Then, we further offer a thorough summary of advanced GAI algorithms, both in terms of unimodal generation and multimodal generation, as shown in Figure 2. In addition, we examine the applications and potential challenges of AIGC. Finally, we highlight the open problems and future directions in this field. In summary, the main contributions of this paper are as follows:

• To the best of our knowledge, we are the first to provide a formal definition and a thorough survey of AIGC and the AI-enhanced generation process.

• We review the history and foundation techniques of AIGC, and conduct a comprehensive analysis of recent advances in GAI tasks and models from the perspectives of unimodal generation and multimodal generation.

• We discuss the main challenges facing AIGC and future research trends confronting AIGC.

1.2 Organization

The rest of the survey is organized as follows. Section 2 reviews the history of AIGC, mainly from the view of the vision and language modalities. Section 3 introduces the basic components that are widely used in today's GAI model training. Section 4 summarizes recent advances of GAI models, among which Section 4.1 reviews the advances from the unimodal perspective and Section 4.2 reviews the advances from the perspective of multimodal generation. Among multimodal generation, we introduce vision-language models, text-audio models, text-graph models and text-code models. Sections 5 and 6 introduce the applications of GAI models in AIGC and some other important research related to this area. Furthermore, Sections 7 and 8 reveal the risks, open problems and future directions of AIGC technologies. Finally, we conclude our research in Section 9.

2 HISTORY OF GENERATIVE AI

Generative models have a long history in artificial intelligence, dating back to the 1950s with the development of Hidden Markov Models (HMMs) [20] and Gaussian Mixture Models (GMMs) [21]. These models generated sequential data such as speech and time series. However, it wasn't until the advent of deep learning that generative models saw significant improvements in performance.

In the early years of deep generative models, different areas did not have much overlap in general. In natural language processing (NLP), a traditional method to generate sentences is to learn the word distribution using N-gram language modeling [22] and then search for the best sequence, as illustrated in the sketch below. However, this method cannot effectively adapt to long sentences. To solve this problem, recurrent neural networks (RNNs) [23] were later introduced for language modeling tasks, allowing for modeling relatively long dependencies. This was followed by the development of Long Short-Term Memory (LSTM) [24] and Gated Recurrent Units (GRUs) [25], which leveraged gating mechanisms to control memory during training. These methods are capable of attending to around 200 tokens in a sample [26], which marks a significant improvement compared to N-gram language models.
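To make this concrete, here is a minimal Python sketch of bigram (N = 2) language modeling on a toy corpus invented for illustration: count adjacent word pairs, then greedily search for the most probable next word. The weakness noted above is visible immediately: each choice conditions only on the single previous word, so longer-range context is ignored entirely.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count bigrams: counts[w1][w2] = number of times w2 follows w1.
counts = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    counts[w1][w2] += 1

def next_word(word):
    """Greedy search: the most probable word to follow `word`."""
    return counts[word].most_common(1)[0][0]

# Generate a short sequence starting from "the".
word, output = "the", ["the"]
for _ in range(4):
    word = next_word(word)
    output.append(word)
print(" ".join(output))  # e.g. "the cat sat on the"
```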

Meanwhile, in computer vision (CV), before the advent of deep learning-based methods, traditional image generation algorithms used techniques such as texture synthesis [27] and texture mapping [28]. These algorithms were based on hand-designed features, and were limited in their ability to generate complex and diverse images. In 2014, Generative Adversarial Networks (GANs) [29] were first proposed, which was a significant milestone in this area, due to their impressive results in various applications. Variational Autoencoders (VAEs) [30] and other methods like diffusion generative models [31] have also been developed for more fine-grained control over the image generation process and the ability to generate high-quality images.

The advancement of generative models in various domains has followed different paths, but eventually an intersection emerged: the transformer architecture [32]. Introduced by Vaswani et al. for NLP tasks in 2017, the Transformer was later applied in CV and then became the dominant backbone for many generative models in various domains [9, 33, 34]. In the field of NLP, many prominent large language models, e.g., BERT and GPT, adopt the transformer architecture as their primary building block, offering advantages over previous building blocks, i.e., LSTM and GRU. In CV, Vision Transformer (ViT) [35] and Swin Transformer [36] later took this concept even further by combining the transformer architecture with visual components, allowing it to be applied to image-based downstream tasks. Beyond the improvement that the transformer brought to individual modalities, this intersection also enabled models from different domains to be fused together for multimodal tasks. One such example of a multimodal model is CLIP [37]. CLIP is a joint vision-language model that combines the transformer architecture with visual components, allowing it to be trained on a massive amount of text and image data. Since it combines visual and language knowledge during pre-training, it can also be used as an image encoder in multimodal prompting for generation. In all, the emergence of transformer-based models revolutionized AI generation and led to the possibility of large-scale training.

[Fig. 3. The history of Generative AI in CV, NLP and VL.]

In recent years, researchers have also begun to introduce new techniques based on these models. For instance, in NLP, instead of fine-tuning, people sometimes prefer few-shot prompting [38], which refers to including a few examples selected from the dataset in the prompt, to help the model better understand task requirements, as sketched below. And in visual language, researchers often combine modality-specific models with self-supervised contrastive learning objectives to provide more robust representations.
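As a concrete illustration, here is a minimal sketch of how such a few-shot prompt might be assembled for a sentiment classification task; the demonstrations, labels, and query are invented for illustration, and the resulting string is simply fed to the language model in place of any gradient update.

```python
# Demonstrations invented for illustration; in practice they are
# selected from the task's dataset.
examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this product.", "negative"),
]
query = "The soundtrack was forgettable but the acting was superb."

# Assemble the prompt: task description, labeled demonstrations, query.
prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"

# The string is sent to the language model as-is; no weights are updated.
print(prompt)
```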

In the future, as AIGC becomes increasingly important, more and more technologies will be introduced, empowering this area with vitality.

3 FOUNDATIONS FOR AIGC

In this section, we introduce foundation models that are commonly used in AIGC.

3.1 Foundation Model

3.1.1 Transformer. The Transformer is the backbone architecture for many state-of-the-art models, such as GPT-3 [9], DALL-E-2 [5], Codex [2], and Gopher [39]. It was first proposed to solve the limitations of traditional models such as RNNs in handling variable-length sequences and context-awareness. The transformer architecture is mainly based on a self-attention mechanism that allows the model to attend to different parts of an input sequence. The Transformer consists of an encoder and a decoder. The encoder takes in the input sequence and generates hidden representations, while the decoder takes in the hidden representations and generates the output sequence. Each layer of the encoder and decoder consists of a multi-head attention and a feed-forward neural network. The multi-head attention is the core component of the Transformer, which learns to assign different weights to tokens according to their relevance. This information routing method allows the model to be better at handling long-term dependencies, hence improving the performance in a wide range of NLP tasks. Another advantage of the transformer is that its architecture makes it highly parallelizable, and allows data to trump inductive biases [40]. This property makes the transformer well-suited for large-scale pre-training, enabling transformer-based models to become adaptable to different downstream tasks.
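To make the mechanism concrete, the following is a minimal NumPy sketch of the scaled dot-product attention that each head computes; the learned per-head projection matrices and the output projection of full multi-head attention are omitted, and the toy shapes are invented for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=False):
    """Scaled dot-product attention: weight value vectors by relevance."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise token relevance (n x n)
    if causal:                           # decoder-style: hide future tokens
        mask = np.tri(scores.shape[0], dtype=bool)
        scores = np.where(mask, scores, -1e9)
    weights = softmax(scores)            # each row sums to 1
    return weights @ V                   # weighted sum of value vectors

# Toy input: 4 tokens with hidden size 8, invented for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)          # self-attention: (4, 8)
```

In the full architecture, several such heads run in parallel on learned projections of the same input, and decoder layers set the causal flag so that each position can only attend to earlier tokens.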

3.1.2 Pre-trained Language Models. Since the introduction of the Transformer architecture, it has become the dominant choice in natural language processing due to its parallelism and learning capabilities. Generally, these transformer-based pre-trained language models can be classified into two types based on their training tasks: autoregressive language modeling and masked language modeling [41]. Given a sentence composed of several tokens, the objective of masked language modeling, e.g., BERT [42] and RoBERTa [43], refers to predicting the probability of a masked token given context information. The most notable example of masked language modeling is BERT [42], which includes masked language modeling and next sentence prediction tasks. RoBERTa [43], which uses the same architecture as BERT, improves its performance by increasing the amount of pre-training data and incorporating more challenging pre-training objectives. XL-Net [44], which is also based on BERT, incorporates permutation operations to change the prediction order for each training iteration, allowing the model to learn more information across tokens. Autoregressive language modeling, e.g., GPT-3 [9] and OPT [45], in contrast, models the probability of the next token given previous tokens, hence, left-to-right language modeling. Different from masked language models, autoregressive models are more suitable for generative tasks. We will introduce more about autoregressive models in Section 4.1.1.

[Fig. 4. Categories of pre-trained LLMs. Black lines represent information flow in bidirectional models, while gray lines represent left-to-right information flow. Encoder models, e.g., BERT, are trained with context-aware objectives. Decoder models, e.g., GPT, are trained with autoregressive objectives. Encoder-decoder models, e.g., T5 and BART, combine the two, using context-aware structures as encoders and left-to-right structures as decoders.]
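The difference between the two objectives is easiest to see in how training examples are constructed. Below is a minimal Python sketch; the token ids, the 15% masking rate (BERT's default), and the [MASK] id are illustrative assumptions.

```python
import random

tokens = [5, 17, 42, 8, 99, 3]  # toy token ids, invented for illustration
MASK_ID = 103                   # BERT-style [MASK] id (an assumption)

# Masked language modeling (BERT-style): corrupt random positions and
# predict the original token there, using context from both directions.
random.seed(1)
masked, targets = list(tokens), {}
for i, tok in enumerate(tokens):
    if random.random() < 0.15:  # mask roughly 15% of positions
        targets[i] = tok
        masked[i] = MASK_ID
print("MLM input:", masked, "-> predict:", targets)

# Autoregressive language modeling (GPT-style): at each position, predict
# the next token from the tokens to its left only.
pairs = [(tokens[:i + 1], tokens[i + 1]) for i in range(len(tokens) - 1)]
print("AR pairs:", pairs)
```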

3.2 Reinforcement Learning from Human Feedback
Despite being trained on large-scale data, the AIGC may not always produce output that aligns with the user's intent, which includes considerations of usefulness and truthfulness. In order to better align AIGC output with human preferences, reinforcement learning from human feedback (RLHF) has been applied to fine-tune models in various applications such as Sparrow, InstructGPT, and ChatGPT [10, 46].

Typically, the whole pipeline of RLHF includes the following three steps: pre-training, reward learning, and fine-tuning with reinforcement learning. First, a language model $\theta_0$ is pre-trained on large-scale datasets as an initial language model. Since the (prompt, answer) pairs given by $\theta_0$ might not align with human purposes, in the second step we train a reward model to encode the diversified and complex human preferences. Specifically, given the same prompt $x$, different generated answers $\{g_1, g_2, \dots, g_N\}$ are evaluated by humans in a pairwise manner. The pairwise comparison relationships are later transferred to pointwise reward scalars, $\{r_1, r_2, \dots, r_N\}$, using an algorithm such as ELO [47]. In the final step, the language model is fine-tuned to maximize the learned reward function using reinforcement learning. To stabilize the RL training, Proximal Policy Optimization (PPO) is often used as the RL algorithm. In each episode of RL training, an empirically-estimated KL penalty term is considered to prevent the model from outputting something peculiar to trick the reward model. Specifically, the total reward $r_{\text{total}}$ at each step is given by $r_{\text{total}}(x, g) = r_{\text{RM}}(x, g) - \lambda_{\text{KL}} D_{\text{KL}}(\pi_\theta \,\|\, \pi_{\theta_0})$, where $r_{\text{RM}}$ is the learned reward model, $D_{\text{KL}}$ is the KL penalty term, and $\pi_\theta$ is the trained policy. For more details on RLHF, please refer to [48].
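As a minimal sketch of this reward shaping, the snippet below computes the per-step total reward with the empirically-estimated KL penalty; the function name, the stubbed log-probabilities, and the coefficient value are illustrative assumptions, not the exact implementation used by any of the systems above.

```python
def total_reward(r_rm, logp_policy, logp_init, kl_coef=0.1):
    """r_total(x, g) = r_RM(x, g) - lambda_KL * D_KL(pi_theta || pi_theta0).

    The KL term is estimated empirically from the sampled answer g as
    log pi_theta(g|x) - log pi_theta0(g|x); it keeps the fine-tuned policy
    close to the initial model so it cannot drift into degenerate outputs
    that merely trick the reward model.
    """
    kl_estimate = logp_policy - logp_init
    return r_rm - kl_coef * kl_estimate

# Toy values invented for illustration: the reward model scores the answer
# +2.0, and the policy now assigns it a higher log-probability than the
# initial model did, so the positive KL estimate offsets part of the bonus.
print(total_reward(r_rm=2.0, logp_policy=-12.3, logp_init=-15.0))  # ~1.73
```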


[Fig. 5. Statistics of model size [52] and training speed across different models and computing devices.]

Although RLHF has shown promising results by incorporating fluency, progress in this field is impeded by a lack of publicly available benchmarks and implementation resources, leading to a perception that RL is a challenging approach for NLP. To address this issue, an open-source library named RL4LMs [49] has recently been introduced, consisting of building blocks for fine-tuning and evaluating RL algorithms on LM-based generation.

Beyond human feedback, the latest dialogue agent, Claude, favors Constitutional AI [50], where the reward model is learned via RL from AI Feedback (RLAIF). Both the critiques and the AI feedback are guided by a small set of principles drawn from a "constitution", which is the only thing provided by humans in Claude. The AI feedback focuses on controlling the outputs to be less harmful by explaining its objections to dangerous queries. Moreover, a recent preliminary theoretical analysis of RLAIF [51] justifies the empirical success of RLHF and provides new insights for specialized RLHF algorithm design for language models.

3.3 Computing

3.3.1 Hardware. In recent years, there have been significant hardware advancements that have facilitated the training of large-scale models. In the past, training a large neural network using CPUs could take several days or even weeks. However, with the emergence of more powerful computing resources, this process has been accelerated by several orders of magnitude. For instance, the NVIDIA A100 GPU is seven times faster than the V100 during BERT-large inference, and 11 times faster than the T4. Additionally, Google's Tensor Processing Units (TPUs), which are designed specifically for deep learning, offer even higher computing performance compared to the current generation of A100 GPUs. This rapid progress in computing power has significantly increased the efficiency of AI model training and opened
