Introduction to Geometric Learning in Python with Geomstats

Nina Miolane, Nicolas Guigui, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, Daniel Brooks, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, et al.

To cite this version:

Nina Miolane, Nicolas Guigui, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, et al. Introduction to Geometric Learning in Python with Geomstats. SciPy 2020 - 19th Python in Science Conference, Jul 2020, Austin, Texas, United States. pp. 48-57, 10.25080/Majora-342d178e-007. hal-02908006

HAL Id: hal-02908006

https://inria.hal.science/hal-02908006

Submitted on 28 Jul 2020


PROC. OF THE 19th PYTHON IN SCIENCE CONF. (SCIPY 2020)

Introduction to Geometric Learning in Python with Geomstats

Nina Miolane‡*, Nicolas Guigui§, Hadi Zaatiti, Christian Shewmake, Hatem Hajri, Daniel Brooks, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, Stefan Heyder, Olivier Peltre, Niklas Koep, Yann Cabanes, Thomas Gerald, Paul Chauchat, Bernhard Kainz, Claire Donnat, Susan Holmes, Xavier Pennec

https://youtu.be/Ju-Wsd84uG0

Abstract: There is a growing interest in leveraging differential geometry in the machine learning community. Yet, the adoption of the associated geometric computations has been inhibited by the lack of a reference implementation. Such an implementation should typically allow its users: (i) to get intuition on concepts from differential geometry through a hands-on approach, often not provided by traditional textbooks; and (ii) to run geometric machine learning algorithms seamlessly, without delving into the mathematical details. To address this gap, we present the open-source Python package geomstats and introduce hands-on tutorials for differential geometry and geometric machine learning algorithms - Geometric Learning - that rely on it. Code and documentation: /geomstats/geomstats and geomstats.ai.

Index Terms: differential geometry, statistics, manifold, machine learning

Introduction

Data on manifolds arise naturally in different fields. Hyperspheres model directional data in molecular and protein biology [KH05] and some aspects of 3D shapes [JDM12], [HVS+16]. Density estimation on hyperbolic spaces arises when modeling electrical impedances [HKKM10], networks [AS14], or reflection coefficients extracted from a radar signal [CBA15]. Symmetric Positive Definite (SPD) matrices are used to characterize data from Diffusion Tensor Imaging (DTI) [PFA06], [YZLM12] and functional Magnetic Resonance Imaging (fMRI) [STK05]. These manifolds are curved, differentiable generalizations of vector spaces. Learning from data on manifolds thus requires techniques from the mathematical discipline of differential geometry. As a result, there is a growing interest in leveraging differential geometry in the machine learning community, supported by the fields of Geometric Learning and Geometric Deep Learning [BBL+17].

Despite this need, the adoption of differential geometric computations has been inhibited by the lack of a reference implementation. Projects implementing code for geometric tools are often custom-built for specific problems and are not easily reused. Some Python packages do exist, but they mainly focus on optimization (Pymanopt [TKW16], Geoopt [BG18], [Koc19], McTorch [MJK+18]), are dedicated to a single manifold (PyRiemann [Bar15], PyQuaternion [Wyn14], PyGeometry [Cen12]), or lack unit-tests and continuous integration (TheanoGeometry [KS17]). An open-source, low-level implementation of differential geometry and associated learning algorithms for manifold-valued data is thus thoroughly welcome.

* Corresponding author: nmiolane@

‡ Stanford University

§ Université Côte d'Azur, Inria

Copyright © 2020 Nina Miolane et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Geomstats is an open-source Python package built for machine learning with data on non-linear manifolds [MGLB+]: a field called Geometric Learning. The library provides object-oriented and extensively unit-tested implementations of essential manifolds, operations, and learning methods with support for different execution backends - namely NumPy, PyTorch, and TensorFlow. This paper illustrates the use of geomstats through hands-on introductory tutorials of Geometric Learning. These tutorials enable users: (i) to build intuition for differential geometry through a hands-on approach, often not provided by traditional textbooks; and (ii) to run geometric machine learning algorithms seamlessly without delving into the lower-level computational or mathematical details. We emphasize that the tutorials are not meant to replace theoretical expositions of differential geometry and geometric learning [Pos01], [PSF19]. Rather, they will complement them with an intuitive, didactic, and engineering-oriented approach.

Presentation of Geomstats

The package geomstats is organized into two main modules: geometry and learning. The module geometry implements low-level differential geometry with an object-oriented paradigm and two main parent classes: Manifold and RiemannianMetric. Standard manifolds like the Hypersphere or the Hyperbolic space are classes that inherit from Manifold. At the time of writing, there are over 15 manifolds implemented in geomstats. The class RiemannianMetric provides computations related to Riemannian geometry on these manifolds, such as the inner product of two tangent vectors at a base point, the geodesic distance between two points, the Exponential and Logarithm maps at a base point, and many others.
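The two-parent-class design described above can be sketched in a few lines. This is a minimal illustration written in plain NumPy, not the geomstats source: the class names Manifold and RiemannianMetric come from the text, but every method body, the EuclideanSpace and EuclideanMetric classes, and the exact signatures are assumptions made for the example.

```python
from abc import ABC, abstractmethod

import numpy as np


class Manifold(ABC):
    """Parent class: a space of a given dimension with a membership test."""

    def __init__(self, dim):
        self.dim = dim

    @abstractmethod
    def belongs(self, point):
        """Return True if point lies on the manifold."""


class RiemannianMetric(ABC):
    """Parent class: inner product, exp, log, and the distance they induce."""

    @abstractmethod
    def inner_product(self, tangent_a, tangent_b, base_point):
        """Inner product of two tangent vectors at base_point."""

    @abstractmethod
    def exp(self, tangent_vec, base_point):
        """Point reached by shooting tangent_vec from base_point."""

    @abstractmethod
    def log(self, point, base_point):
        """Tangent vector at base_point that shoots to point."""

    def norm(self, tangent_vec, base_point):
        return np.sqrt(self.inner_product(tangent_vec, tangent_vec, base_point))

    def dist(self, point_a, point_b):
        # Geodesic distance = length of the log of point_b at point_a.
        return self.norm(self.log(point_b, point_a), point_a)


class EuclideanMetric(RiemannianMetric):
    """Flat metric (hypothetical example): exp is addition, log is subtraction."""

    def inner_product(self, tangent_a, tangent_b, base_point):
        return float(np.dot(tangent_a, tangent_b))

    def exp(self, tangent_vec, base_point):
        return base_point + tangent_vec

    def log(self, point, base_point):
        return point - base_point


class EuclideanSpace(Manifold):
    """Simplest manifold (hypothetical example), carrying its metric as an attribute."""

    def __init__(self, dim):
        super().__init__(dim)
        self.metric = EuclideanMetric()

    def belongs(self, point):
        return np.shape(point) == (self.dim,)


plane = EuclideanSpace(dim=2)
origin = np.array([0.0, 0.0])
target = np.array([3.0, 4.0])
distance = plane.metric.dist(origin, target)
```

Keeping the metric as an attribute of the manifold, as in this sketch, is what lets one manifold be paired with several metrics without changing the manifold class.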

The module learning implements statistics and machine learning algorithms for data on manifolds. The code is object-oriented and classes inherit from scikit-learn base classes and mixins such as BaseEstimator, ClassifierMixin, or RegressorMixin. This module provides implementations of Fréchet mean estimators, K-means, and principal component analysis (PCA) designed for manifold data. The algorithms can be applied seamlessly to the different manifolds implemented in the library.

The code follows international standards for readability and ease of collaboration, is vectorized for batch computations, undergoes unit-testing with continuous integration, and incorporates both TensorFlow and PyTorch backends to allow for GPU acceleration. The package comes with a visualization module that enables users to visualize and further develop an intuition for differential geometry. In addition, the datasets module provides instructive toy datasets on manifolds. The repositories examples and notebooks provide convenient starting points to get familiar with geomstats.

First Steps

To begin, we need to install geomstats. We follow the installation procedure described in the first steps of the online documentation. Next, in the command line, we choose the backend of interest: NumPy, PyTorch or TensorFlow. Then, we open the iPython notebook and import the backend together with the visualization module. In the command line:

    export GEOMSTATS_BACKEND=numpy

then, in the notebook:

    import geomstats.backend as gs
    import geomstats.visualization as visualization

    visualization.tutorial_matplotlib()

    INFO: Using numpy backend

Modules related to matplotlib and logging should be imported during setup too. More details on setup can be found on the documentation website: geomstats.ai. All standard NumPy functions should be called using the gs. prefix - e.g. gs.exp, gs.log - in order to automatically use the backend of interest.

Tutorial: Statistics and Geometric Statistics

This tutorial illustrates how Geometric Statistics and Learning differ from traditional Statistics. Statistical theory is usually defined for data belonging to vector spaces, which are linear spaces. For example, we know how to compute the mean of a set of numbers or of multidimensional arrays.

Now consider a non-linear space: a manifold. A manifold M of dimension m is a space that is possibly curved but that looks like an m-dimensional vector space in a small neighborhood of every point. A sphere, like the earth, is a good example of a manifold. What happens when we apply statistical theory defined for linear vector spaces to data that does not naturally belong to a linear space? For example, what happens if we want to perform statistics on the coordinates of world cities lying on the earth's surface: a sphere? Let us compute the mean of two data points on the sphere using the traditional definition of the mean.

    from geomstats.geometry.hypersphere import Hypersphere

    n_samples = 2
    sphere = Hypersphere(dim=2)
    points_in_manifold = sphere.random_uniform(
        n_samples=n_samples)

    linear_mean = gs.sum(
        points_in_manifold, axis=0) / n_samples

Fig. 1: Left: Linear mean of two points on the sphere. Right: Fréchet mean of two points on the sphere. The linear mean does not belong to the sphere, while the Fréchet mean does. This illustrates how linear statistics can be generalized to data on manifolds, such as points on the sphere.

The result is shown in Figure 1 (left). What happened? The linear mean of two points on a manifold (here, the sphere) generally does not lie on the manifold. In our example, the mean of these cities is not on the earth's surface. This leads to errors in statistical computations. The line sphere.belongs(linear_mean) returns False. For this reason, researchers aim to build a theory of statistics that is - by construction - compatible with any structure with which we equip the manifold. This theory is called Geometric Statistics, and the associated learning algorithms: Geometric Learning.
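The failure of the linear mean can be reproduced without geomstats. The sketch below uses plain NumPy and two hand-picked unit vectors (illustrative values, not the random points of the snippet above); the membership test is simply a check that the norm equals 1.

```python
import numpy as np

# Two unit vectors on the 2-sphere embedded in R^3 (illustrative values).
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

# Ordinary (linear) mean, computed coordinate by coordinate.
linear_mean = points.sum(axis=0) / len(points)

# A point belongs to the unit sphere only if its norm is 1.
norm_of_mean = np.linalg.norm(linear_mean)
on_sphere = bool(np.isclose(norm_of_mean, 1.0))
```

Here the linear mean is (0.5, 0.5, 0), whose norm is 1/sqrt(2): it sits inside the ball, strictly below the surface.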

In this specific example of mean computation, Geometric Statistics provides a generalization of the definition of "mean" to manifolds: the Fréchet mean.

    from geomstats.learning.frechet_mean import FrechetMean

    estimator = FrechetMean(metric=sphere.metric)
    estimator.fit(points_in_manifold)
    frechet_mean = estimator.estimate_

Notice in this code snippet that geomstats provides classes and methods whose API will be instantly familiar to users of the widely-adopted scikit-learn. We plot the result in Figure 1 (right). Observe that the Fréchet mean now belongs to the surface of the sphere!
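To make the idea concrete, here is a minimal sketch of one common way to compute a Fréchet mean on the unit sphere: repeatedly average the data in the tangent space at the current estimate, then map that average back onto the sphere. It is written in plain NumPy; the class name SphereFrechetMean, its parameters, and the helper functions are hypothetical, and the iteration shown is a standard fixed-point scheme, not necessarily the one geomstats uses. It only mimics the fit / estimate_ convention mentioned above.

```python
import numpy as np


def sphere_exp(tangent_vec, base_point):
    """Shoot from base_point along tangent_vec on the unit sphere."""
    norm = np.linalg.norm(tangent_vec)
    if norm < 1e-12:
        return base_point.copy()
    return np.cos(norm) * base_point + np.sin(norm) * tangent_vec / norm


def sphere_log(point, base_point):
    """Tangent vector at base_point pointing towards point."""
    cos_angle = np.clip(np.dot(point, base_point), -1.0, 1.0)
    direction = point - cos_angle * base_point
    norm = np.linalg.norm(direction)
    if norm < 1e-12:  # identical points (antipodal points are not handled)
        return np.zeros_like(base_point)
    return np.arccos(cos_angle) * direction / norm


class SphereFrechetMean:
    """Hypothetical scikit-learn-style estimator: call fit, read estimate_."""

    def __init__(self, max_iter=50, tol=1e-10):
        self.max_iter = max_iter
        self.tol = tol

    def fit(self, points):
        mean = points[0]
        for _ in range(self.max_iter):
            # Average the data as seen from the tangent space at the estimate.
            tangent_mean = np.mean(
                [sphere_log(p, mean) for p in points], axis=0)
            if np.linalg.norm(tangent_mean) < self.tol:
                break
            # Move the estimate along the geodesic in that direction.
            mean = sphere_exp(tangent_mean, mean)
        self.estimate_ = mean
        return self


points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
estimator = SphereFrechetMean().fit(points)
```

For these two points the estimate is the midpoint of the great-circle arc joining them, which has unit norm, in contrast to the linear mean.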

Beyond the computation of the mean, geomstats provides statistics and learning algorithms on manifolds that leverage their specific geometric structure. Such algorithms rely on elementary operations that are introduced in the next tutorial.

Tutorial: Elementary Operations for Data on Manifolds

The previous tutorial showed why we need to generalize traditional statistics for data on manifolds. This tutorial shows how to perform the elementary operations that allow us to "translate" learning algorithms from linear spaces to manifolds.

We import data that lie on a manifold: the world cities dataset, that contains coordinates of cities on the earth's surface. We visualize it in Figure 2.

    import geomstats.datasets.utils as data_utils

    data, names = data_utils.load_cities()

Fig. 2: Subset of the world cities dataset, available in geomstats with the function load_cities from the module datasets.utils. Cities' coordinates are data on the sphere, which is an example of a manifold. [Cities labeled in the figure: Paris, Moscow, Istanbul, Beijing, Manilla.]

How can we compute with data that lie on such a manifold? The elementary operations on a vector space are addition and subtraction. In a vector space (in fact seen as an affine space), we can add a vector to a point and subtract two points to get a vector. Can we generalize these operations in order to compute on manifolds?

For points on a manifold, such as the sphere, the same operations are not permitted. Indeed, adding a vector to a point will not give a point that belongs to the manifold: in Figure 3, adding the black tangent vector to the blue point gives a point that is outside the surface of the sphere. So, we need to generalize to manifolds the operations of addition and subtraction.

On manifolds, the exponential map is the operation that generalizes the addition of a vector to a point. The exponential map takes the following inputs: a point and a tangent vector to the manifold at that point. These are shown in Figure 3 using the blue point and its tangent vector, respectively. The exponential map returns the point on the manifold that is reached by "shooting" with the tangent vector from the point. "Shooting" means following a "geodesic" on the manifold, which is the dotted path in Figure 3. A geodesic, roughly, is the analog of a straight line for general manifolds - the path whose length, or energy, is minimal between two points, where the notions of length and energy are defined by the Riemannian metric. This code snippet shows how to compute the exponential map and the geodesic with geomstats.

    from geomstats.geometry.hypersphere import Hypersphere

    sphere = Hypersphere(dim=2)

    initial_point = paris = data[19]
    vector = gs.array([1, 0, 0.8])
    tangent_vector = sphere.to_tangent(
        vector, base_point=initial_point)

    end_point = sphere.metric.exp(
        tangent_vector, base_point=initial_point)

    geodesic = sphere.metric.geodesic(
        initial_point=initial_point,
        initial_tangent_vec=tangent_vector)
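For the unit sphere with its usual round metric, the exponential map has a closed form, which makes the "shooting" picture easy to check by hand. The following is a plain NumPy sketch under that assumption; the function names are illustrative, and the base point is the north pole, not the Paris coordinates used above.

```python
import numpy as np


def to_tangent(vector, base_point):
    """Project an ambient vector onto the tangent plane at base_point."""
    return vector - np.dot(vector, base_point) * base_point


def sphere_exp(tangent_vec, base_point):
    """Closed-form exponential map of the unit sphere (round metric)."""
    norm = np.linalg.norm(tangent_vec)
    if norm < 1e-12:
        return base_point.copy()
    # Travel an arc of length `norm` along the great circle leaving
    # base_point in the direction of tangent_vec.
    return np.cos(norm) * base_point + np.sin(norm) * tangent_vec / norm


def geodesic(t, tangent_vec, base_point):
    """Point at time t on the geodesic with the given initial velocity."""
    return sphere_exp(t * tangent_vec, base_point)


base_point = np.array([0.0, 0.0, 1.0])  # north pole
tangent_vec = to_tangent(np.array([1.0, 0.0, 0.8]), base_point)
end_point = sphere_exp(tangent_vec, base_point)
midpoint = geodesic(0.5, tangent_vec, base_point)
```

The projection removes the component of the vector along the base point, so the tangent vector here is (1, 0, 0), and shooting with it for unit time lands at (sin 1, 0, cos 1), which is again a unit vector.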

Similarly, on manifolds, the logarithm map is the operation that generalizes the subtraction of two points on vector spaces. The logarithm map takes two points on the manifold as inputs and returns the tangent vector required to "shoot" from one point to the other. At any point, it is the inverse of the exponential map. In Figure 3, the logarithm of the orange point at the blue point returns the tangent vector in black. This code snippet shows how to compute the logarithm map with geomstats.

Fig. 3: Elementary operations on manifolds illustrated on the sphere. The exponential map at the initial point (blue point) shoots the black tangent vector along the geodesic, and gives the end point (orange point). Conversely, the logarithm map at the initial point (blue point) takes the end point (orange point) as input, and outputs the black tangent vector. The geodesic between the blue point and the orange point represents the path of shortest length between the two points.

    log = sphere.metric.log(
        point=end_point, base_point=initial_point)
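The inverse relationship between the two maps can be verified numerically. This sketch again assumes the unit sphere with the round metric and uses plain NumPy with illustrative function names; it also shows that the norm of the logarithm is the geodesic distance between the two points.

```python
import numpy as np


def sphere_exp(tangent_vec, base_point):
    """Closed-form exponential map of the unit sphere."""
    norm = np.linalg.norm(tangent_vec)
    if norm < 1e-12:
        return base_point.copy()
    return np.cos(norm) * base_point + np.sin(norm) * tangent_vec / norm


def sphere_log(point, base_point):
    """Closed-form logarithm map of the unit sphere."""
    cos_angle = np.clip(np.dot(point, base_point), -1.0, 1.0)
    angle = np.arccos(cos_angle)  # arc length between the two points
    direction = point - cos_angle * base_point  # component tangent at base_point
    norm = np.linalg.norm(direction)
    if norm < 1e-12:  # identical points (the antipodal case is not handled)
        return np.zeros_like(base_point)
    return angle * direction / norm


base_point = np.array([0.0, 0.0, 1.0])
tangent_vec = np.array([0.3, -0.4, 0.0])  # tangent at the north pole, norm 0.5

end_point = sphere_exp(tangent_vec, base_point)
recovered = sphere_log(end_point, base_point)
geodesic_distance = np.linalg.norm(recovered)
```

The recovered vector equals the original tangent vector, and the geodesic distance is 0.5, the length of the vector that was shot. The round trip only holds for tangent vectors shorter than pi; beyond that, the logarithm returns the shorter way around.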

We emphasize that the exponential and logarithm maps depend on the "Riemannian metric" chosen for a given manifold: observe in the code snippets that they are not methods of the sphere object, but rather of its metric attribute. The Riemannian metric defines the notion of exponential, logarithm, geodesic and distance between points on the manifold. We could have chosen a different metric on the sphere that would have changed the distance between the points: with a different metric, the "sphere" could, for example, look like an ellipsoid.

Using the exponential and logarithm maps instead of linear addition and subtraction, many learning algorithms can be generalized to manifolds. We illustrated the use of the exponential and logarithm maps on the sphere only; yet, geomstats provides their implementation for over 15 different manifolds in its geometry module with support for a variety of Riemannian metrics. Consequently, geomstats also implements learning algorithms on manifolds, taking into account their specific geometric structure by relying on the operations we just introduced. The next tutorials show more involved examples of such geometric learning algorithms.

Tutorial: Classification of SPD Matrices

Tutorial context and description

We demonstrate that any standard machine learning algorithm can be applied to data on manifolds while respecting their geometry. In the previous tutorials, we saw that linear operations (mean, linear weighting, addition and subtraction) are not defined on manifolds. However, each point on a manifold has an associated tangent space which is a vector space. As such, in the tangent space, these operations are well defined! Therefore, we can use the logarithm map (see Figure 3 from the previous tutorial) to go from points on manifolds to vectors in the tangent space at a reference point. This first strategy enables the use of traditional learning algorithms on manifolds.
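As a small illustration of this first strategy, the sketch below maps synthetic points on the unit sphere to the tangent plane at a reference point and then runs an ordinary linear computation (a principal component analysis via SVD) on the resulting vectors. It uses plain NumPy; the data, the choice of the north pole as reference point, and the function name are assumptions made for the example.

```python
import numpy as np


def sphere_log(point, base_point):
    """Closed-form logarithm map of the unit sphere."""
    cos_angle = np.clip(np.dot(point, base_point), -1.0, 1.0)
    direction = point - cos_angle * base_point
    norm = np.linalg.norm(direction)
    if norm < 1e-12:
        return np.zeros_like(base_point)
    return np.arccos(cos_angle) * direction / norm


rng = np.random.default_rng(0)
base_point = np.array([0.0, 0.0, 1.0])  # reference point: north pole

# Synthetic data: noisy copies of the reference point, pushed back onto the sphere.
raw = base_point + 0.2 * rng.normal(size=(20, 3))
points = raw / np.linalg.norm(raw, axis=1, keepdims=True)

# Step 1: replace every point by its logarithm at the reference point.
tangent_vecs = np.array([sphere_log(p, base_point) for p in points])

# Step 2: the tangent vectors live in a vector space, so linear tools apply.
centered = tangent_vecs - tangent_vecs.mean(axis=0)
_, singular_values, principal_dirs = np.linalg.svd(
    centered, full_matrices=False)
```

All tangent vectors are orthogonal to the reference point, so the data effectively has two degrees of freedom and the third singular value vanishes.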

A second strategy can be designed for learning algorithms, such as K-Nearest Neighbors classification, that rely only on distances or dissimilarity metrics. In this case, we can compute the pairwise distances between the data points on the manifold, using the method metric.dist, and feed them to the chosen algorithm.
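The second strategy can be sketched just as briefly. Below, a matrix of pairwise geodesic distances on the unit sphere is computed in plain NumPy and handed to a hand-written leave-one-out nearest-neighbor rule; the six points, their labels, and the function name are made up for the illustration, and the nearest-neighbor rule stands in for any distance-based algorithm.

```python
import numpy as np


def pairwise_sphere_dist(points):
    """Matrix of great-circle distances between unit vectors."""
    gram = np.clip(points @ points.T, -1.0, 1.0)
    return np.arccos(gram)


# Two small clusters: one near the north pole, one near the x-axis.
raw = np.array([[0.1, 0.0, 1.0], [0.0, 0.1, 1.0], [-0.1, 0.0, 1.0],
                [1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, -0.1, 0.0]])
points = raw / np.linalg.norm(raw, axis=1, keepdims=True)
labels = np.array([0, 0, 0, 1, 1, 1])

dist = pairwise_sphere_dist(points)

# Leave-one-out 1-nearest-neighbor: ignore each point's distance to itself.
masked = dist.copy()
np.fill_diagonal(masked, np.inf)
nearest = masked.argmin(axis=1)
predicted = labels[nearest]
```

The classifier never sees coordinates, only the distance matrix, which is why the same code works unchanged for any manifold and metric that can supply pairwise distances.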

Both strategies can be applied to any manifold-valued data. In this tutorial, we consider symmetric positive definite (SPD) matrices from brain connectomics data and perform logistic regression and K-Nearest Neighbors classification.

SPD matrices in the literature

Before diving into the tutorial, let us recall a few applications of SPD matrices in the machine learning literature. SPD matrices are ubiquitous across many fields [CS16], either as inputs to or outputs of a given problem. In DTI for instance, voxels are represented by "diffusion tensors" which are 3x3 SPD matrices representing ellipsoids in their structure. These ellipsoids spatially characterize the diffusion of water molecules in various tissues. Each DTI thus consists of a field of SPD matrices, where each point in space corresponds to an SPD matrix. These matrices then serve as inputs to regression models. In [YZLM12] for example, the authors use an intrinsic local polynomial regression to compare fiber tracts between HIV subjects and a control group. Similarly, in fMRI, it is possible to extract connectivity graphs from time series of patients' resting-state images [WZD+13]. The regularized graph Laplacians of these graphs form a dataset of SPD matrices. This provides a compact summary of brain connectivity patterns which is useful for assessing neurological responses to a variety of stimuli, such as drugs or patients' activities.

More generally speaking, covariance matrices are also SPD matrices which appear in many settings. Covariance clustering can be used for various applications such as sound compression in acoustic models of automatic speech recognition (ASR) systems [SMA10] or for material classification [FHP15], among others. Covariance descriptors are also popular image or video descriptors [HHLS16].

Lastly, SPD matrices have found applications in deep learning. The authors of [GWB+19] show that an aggregation of learned deep convolutional features into an SPD matrix creates a robust representation of images which outperforms state-of-the-art methods for visual classification.

Manifold of SPD matrices

Let us recall the mathematical definition of the manifold of SPD matrices. The manifold of SPD matrices in n dimensions is embedded in the General Linear group of invertible matrices and defined as:

$$\mathrm{SPD} = \left\{ S \in \mathbb{R}^{n \times n} : S^T = S, \ \forall z \in \mathbb{R}^n, z \neq 0, \ z^T S z > 0 \right\}.$$
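The definition translates directly into a numerical membership test: a matrix is SPD when it is symmetric and all its eigenvalues are strictly positive, which is equivalent to the condition on z above. The sketch below is plain NumPy with a hypothetical function name and tolerance, not the geomstats belongs method, and the two example matrices are arbitrary.

```python
import numpy as np


def is_spd(matrix, atol=1e-10):
    """Check symmetry and strict positivity of all eigenvalues."""
    if not np.allclose(matrix, matrix.T, atol=atol):
        return False
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(matrix)
    return bool(np.all(eigenvalues > atol))


a = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b = np.array([[3.0, 0.0],
              [0.0, 0.5]])

sum_is_spd = is_spd(a + b)
difference_is_spd = is_spd(a - b)
```

Both a and b pass the test, and so does their sum, but their difference has a negative determinant and therefore a negative eigenvalue. The set of SPD matrices is thus not a vector space, which is the reason tangent spaces are needed later in this tutorial.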

The class SPDMatrices inherits from the class EmbeddedManifold and has an embedding_manifold attribute which stores an object of the class GeneralLinear. SPD matrices in 2 dimensions can be visualized as ellipses with principal axes given by the eigenvectors of the SPD matrix, and the length of each axis proportional to the square-root of the corresponding eigenvalue. This is implemented in the visualization module of geomstats.

Fig. 4: Simulated dataset of SPD matrices in 2 dimensions. We observe 3 classes of SPD matrices, illustrated with the colors red, green, and blue. The centroid of each class is represented by an ellipse of larger width.

We generate a toy dataset and plot it in Figure 4 with the following code snippet.

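    import geomstats.datasets.sample_sdp_2d as sampler

    n_samples = 100
    dataset_generator = sampler.DatasetSPD2D(
        n_samples, n_features=2, n_classes=3)
    # The snippet does not show where data and labels come from; they are
    # assumed here to be produced by the generator, e.g.
    # data, labels = dataset_generator.generate()

    ellipsis = visualization.Ellipsis2D()
    for i, x in enumerate(data):
        y = sampler.get_label_at_index(i, labels)
        ellipsis.draw(
            x, color=ellipsis.colors[y], alpha=.1)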
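The ellipse representation of a 2x2 SPD matrix described earlier can be computed from an eigendecomposition. The following plain NumPy sketch is an illustration, not the Ellipsis2D implementation: the function name is hypothetical, and the proportionality constant between axis length and square-root eigenvalue is taken to be 1.

```python
import numpy as np


def ellipse_axes(spd_matrix):
    """Semi-axis lengths and directions of the ellipse of an SPD matrix."""
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(spd_matrix)
    semi_axes = np.sqrt(eigenvalues)
    return semi_axes, eigenvectors


spd_matrix = np.array([[4.0, 0.0],
                       [0.0, 1.0]])
semi_axes, directions = ellipse_axes(spd_matrix)

# Boundary points: scale each principal direction by its semi-axis length
# and sweep an angle around the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 100)
unit_circle = np.vstack([np.cos(angles), np.sin(angles)])
boundary = (directions * semi_axes) @ unit_circle  # shape (2, 100)
```

For this diagonal matrix the semi-axes are 1 and 2, aligned with the coordinate axes. With this scaling, every boundary point x satisfies x^T S^{-1} x = 1, so the curve is the unit level set of the quadratic form defined by the inverse matrix.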

Figure 4 shows a dataset of SPD matrices in 2 dimensions organized into 3 classes. This visualization helps in developing an intuition on the connectomes dataset that is used in the upcoming tutorial, where we will classify SPD matrices in 28 dimensions into 2 classes.

Classifying brain connectomes in Geomstats

We now delve into the tutorial in order to illustrate the use of traditional learning algorithms on the tangent spaces of manifolds implemented in geomstats. We use brain connectome data from the MLSP 2014 Schizophrenia Challenge. The connectomes are correlation matrices extracted from the time-series of resting-state fMRIs of 86 patients at 28 brain regions of interest: they are points on the manifold of SPD matrices in n=28 dimensions. Our goal is to use the connectomes to classify patients into two classes: schizophrenic and control. First we load the connectomes and display two of them as heatmaps in Figure 5.

    import geomstats.datasets.utils as data_utils

    data, patient_ids, labels = data_utils.load_connectomes()

Fig. 5: Subset of the connectomes dataset, available in geomstats with the function load_connectomes from the module datasets.utils. Connectomes are correlation matrices of 28 time-series extracted from fMRI data: they are elements of the manifold of SPD matrices in 28 dimensions. Left: connectome of a schizophrenic subject. Right: connectome of a healthy control.

Multiple metrics can be used to compute on the manifold of SPD matrices [DKZ09]. As mentioned in the previous tutorial, different metrics define different geodesics, exponential and logarithm maps and therefore different algorithms on a given manifold. Here, we import two of the most commonly used metrics on the SPD matrices, the log-Euclidean metric and the affine-invariant metric [PFA06], but we highlight that geomstats contains many more. We also check that our connectome data indeed belongs to the manifold of SPD matrices:

    import geomstats.geometry.spd_matrices as spd

    manifold = spd.SPDMatrices(n=28)
    le_metric = spd.SPDMetricLogEuclidean(n=28)
    ai_metric = spd.SPDMetricAffine(n=28)
    logging.info(gs.all(manifold.belongs(data)))

    INFO: True

Great! Now, although the sum of two SPD matrices is an SPD matrix, their difference or their linear combination with non-positive weights is not necessarily SPD. Therefore we need to work in a tangent space of the SPD manifold to perform simple machine learning that relies on linear operations. The preprocessing module with its ToTangentSpace class allows us to do exactly this.

    from geomstats.learning.preprocessing import ToTangentSpace

ToTangentSpace has a simple purpose: it computes the Fréchet Mean of the data set, and takes the logarithm map of each data point from the mean. This results in a data set of tangent vectors at the mean. In the case of the SPD manifold, these are simply symmetric matrices. ToTangentSpace then squeezes each symmetric matrix into a 1d-vector of size dim = 28 * (28 + 1) / 2, and outputs an array of shape [n_connectomes, dim], which can be fed to your favorite scikit-learn algorithm.

We emphasize that ToTangentSpace computes the mean of the input data, and thus should be used in a pipeline (like, e.g., scikit-learn's StandardScaler) to avoid leaking information from the test set at train time.

    from sklearn.pipeline import make_pipeline
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    # The source snippet is cut off after "make_pipeline("; the arguments and
    # the two lines that follow are reconstructed from the surrounding text
    # and are assumptions (hyperparameters left at their defaults).
    pipeline = make_pipeline(
        ToTangentSpace(le_metric),
        LogisticRegression())
    result = cross_validate(pipeline, data, labels)
    logging.info(result['test_score'].mean())

And with the affine-invariant metric, replacing le_metric by ai_metric in the above snippet:

    INFO: 0.71

We observe that the result depends on the metric. The Riemannian metric indeed defines the notion of the logarithm map, which is used to compute the Fréchet Mean and the tangent vectors corresponding to the input data points. Thus, changing the metric changes the result. Furthermore, some metrics may be more suitable than others for different applications. Indeed, we find published results that show how useful geometry can be with data on the SPD manifold (e.g. [WAZF18], [NDV+14]).
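To give a feel for what a tangent-space preprocessing step does, here is a deliberately simplified NumPy sketch in the spirit of the log-Euclidean metric: each SPD matrix is sent to its matrix logarithm (a symmetric matrix), the logarithms are centered at their mean, and the upper triangle is flattened into a feature vector of size n(n+1)/2. The function names and the synthetic data are assumptions, and the actual ToTangentSpace class may differ in its details, for instance in how it takes the logarithm at the mean or scales off-diagonal entries.

```python
import numpy as np


def spd_logm(spd_matrix):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(spd_matrix)
    return (eigenvectors * np.log(eigenvalues)) @ eigenvectors.T


def to_tangent_vectors(spd_matrices):
    """Map SPD matrices to centered, flattened symmetric matrices."""
    logs = np.array([spd_logm(s) for s in spd_matrices])
    # In the log domain the mean is an ordinary arithmetic mean.
    centered = logs - logs.mean(axis=0)
    n = centered.shape[-1]
    rows, cols = np.triu_indices(n)  # upper triangle, diagonal included
    return centered[:, rows, cols]


# Synthetic SPD matrices (illustrative stand-in for the connectomes).
rng = np.random.default_rng(0)
n = 4
factors = rng.normal(size=(10, n, n))
spd_matrices = factors @ factors.transpose(0, 2, 1) + n * np.eye(n)

features = to_tangent_vectors(spd_matrices)
```

With n = 4 each matrix becomes a vector of 4 * 5 / 2 = 10 features, the same formula that gives 28 * 29 / 2 = 406 features for the connectomes. Because the centering uses the mean of whatever data it is given, a step like this has to be fitted on the training folds only, which is the point of wrapping it in a pipeline.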

We saw how to use the representation of points on the manifold as tangent vectors at a reference point to fit any machine learning algorithm, and we compared the effect of different metrics on the manifold of SPD matrices. Another class of machine learning algorithms can be used very easily on manifolds with geomstats: those relying on dissimilarity matrices. We can compute the matrix of pairwise Riemannian distances, using the dist method of the Riemannian metric object. In the following code-snippet, we use ai_metric.dist and pass the corresponding matrix pairwise_dist of pairwise distances to scikit-learn's K-Nearest-Neighbors (KNN) classification algorithm:

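    from sklearn.neighbors import KNeighborsClassifier

    # pairwise_dist is the square matrix of ai_metric.dist values between
    # all pairs of connectomes, computed beforehand.
    classifier = KNeighborsClassifier(
        metric='precomputed')

    result = cross_validate(
        classifier, pairwise_dist, labels)
    logging.info(result['test_score'].mean())

    INFO: 0.72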
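The matrix pairwise_dist is not constructed in the snippet above. As a sketch of what such a matrix contains, the affine-invariant distance between two SPD matrices A and B can be written as the Frobenius norm of log(A^{-1/2} B A^{-1/2}), and computed from eigenvalues. The code below is plain NumPy on synthetic matrices; the function names are hypothetical and it is not the geomstats implementation.

```python
import numpy as np


def spd_inv_sqrt(spd_matrix):
    """Inverse square root of an SPD matrix via its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(spd_matrix)
    return (eigenvectors / np.sqrt(eigenvalues)) @ eigenvectors.T


def affine_invariant_dist(a, b):
    """Affine-invariant Riemannian distance between two SPD matrices."""
    inv_sqrt_a = spd_inv_sqrt(a)
    eigenvalues = np.linalg.eigvalsh(inv_sqrt_a @ b @ inv_sqrt_a)
    return np.sqrt(np.sum(np.log(eigenvalues) ** 2))


def pairwise_dist_matrix(matrices):
    """Symmetric matrix of distances, suitable for metric='precomputed'."""
    n_matrices = len(matrices)
    dist = np.zeros((n_matrices, n_matrices))
    for i in range(n_matrices):
        for j in range(i + 1, n_matrices):
            dist[i, j] = dist[j, i] = affine_invariant_dist(
                matrices[i], matrices[j])
    return dist


# Synthetic SPD matrices (illustrative stand-in for the connectomes).
rng = np.random.default_rng(0)
factors = rng.normal(size=(5, 3, 3))
mats = factors @ factors.transpose(0, 2, 1) + 3 * np.eye(3)
dist_matrix = pairwise_dist_matrix(mats)

# The distance is unchanged when both matrices undergo the same congruence.
g = np.array([[1.0, 0.5, 0.0],
              [0.0, 2.0, 0.3],
              [0.1, 0.0, 1.5]])
congruent_dist = affine_invariant_dist(
    g @ mats[0] @ g.T, g @ mats[1] @ g.T)
```

The last lines illustrate where the metric gets its name: transforming both matrices by the same invertible matrix g leaves their distance unchanged.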

This tutorial showed how to leverage geomstats to use standard learning algorithms for data on a manifold. In the next tutorial, we see a more complicated situation: the data points are not provided by default as elements of a manifold. We will need to use the low-level geomstats operations to design a method that embeds the dataset in the manifold of interest. Only then can we use a learning algorithm.

Tutorial: Learning Graph Representations with Hyperbolic Spaces

Tutorial context and description

This tutorial demonstrates how to make use of the low-level geometric operations in geomstats to implement a method that embeds graph data into the hyperbolic space. Thanks to the discovery of hyperbolic embeddings, learning on Graph-Structured Data (GSD) has seen major achievements in recent years. It had been speculated for years that hyperbolic spaces may better represent GSD than Euclidean spaces [Gro87] [KPK+10] [BPK10] [ASM13]. These speculations have recently been shown
