An Introduction To Olap In Sql Server 2005.pdf

(49 KB) Pobierz
An Introduction to OLAP in SQL Server 2005
An Introduction to OLAP in SQL Server 2005
Page 1
http://www.devx.com
Printed from http://www.devx.com/dbzone/Article/21410
AnIntroductiontoOLAPinSQLServer2005
GetapreviewoftheupcomingSQLServer2005'sBusinessIntelligencesuiteandfindoutaboutthemajorOLAP
componentsofAnalysisServices.
byGailTieh
fyou'readatabasedeveloper,you'venodoubtheardofOn-LineAnalyticalProcessing(OLAP)andtheadvantagesofanalysisusingmulti-
dimensional,pre-aggregateddata.Maybeyou'veeventhoughtaboutcreatingyourownmultidimensionalcubestogiveyourenduserstruead
hoccapabilities,includingthecreationofcalculatedmeasures/KPIs.Ifyou'verelegatedthattasktothebackburnerbecauseitwastoocomplex,you'll
behappytoknowthatSQL2005hasmadetheprocesseasier.
ThisarticlediscussesthemajorOLAPcomponentsofAnalysisServices,allofwhichcanbeimplementedbyevenafirst-timecubebuilder.Afollow
uparticlebyMarkFrawleywillexaminethedifferencesbetweenAnalysisServicesinSQL2000andSQL2005.
WhyUseOLAP?
OLAPisusefulbecauseitprovidesfastandinteractiveaccesstoaggregateddataandtheabilitytodrilldowntodetail.OLAPletsusersviewand
interrogatelargevolumesofdata(oftenmillionsofrows)bypre-aggregatingtheinformation.Itputsthedataneededtomakestrategicdecisions
directlyintothehandsofthedecisionmakers,notonlythroughpre-definedqueriesandreports,butalsobecauseitgivesenduserstheabilityto
performtheirownadhocqueries,minimizingusers'dependenceondatabasedevelopers.
What'stheSecret?
OLAPleveragesexistingdatafromarelationalschemaordatawarehouse(datasource)byplacingkeyperformanceindicators(measures)into
context(dimensions).Onceprocessedintoamultidimensionaldatabase(cube),allofthemeasuresarepre-aggregated,whichmakesdataretrieval
significantlyfaster.Theprocessedcubecanthenbemadeavailabletobusinessuserswhocanbrowsethedatausingavarietyoftools,makingad
hocanalysisaninteractiveandanalyticalprocessratherthanadevelopmenteffort.SQLServer2005'sBIWorkbenchsubstantiallyimprovesupon
SQLServer2000'sBIcapability.
SQLServer2005BIWorkbenchPlatform
TheSQLServer2005BIWorkbenchsuiteconsistsoffivebasictools:
SQLServerRelationalDatabase—usedtocreaterelationaldatabase
AnalysisServices—usedtocreatemultidimensionalmodel(measures,dimensionsandschema)
DataTransformationServices(DTS)—usedtoextract,transformandloaddatafromsource(s)tothe
datawarehouseorschema
ReportingServices—usedtobuildandmanageenterprisereportingusingtherelationalor
multidimensionalsources
DataMining—usedtoextractinformationbasedonpredeterminedalgorithms.
Figure1 .AnalysisServices
Architecture:Thefigureshowsthe
relationshipofthevarious
technologytiersinvolvedinAnalysis
Services.
ThisremainderofthisarticlefocusesonmultidimensionalmodelingusingAnalysisServicesandbriefly
touchesuponDTS'srole. Figure1 showstheAnalysisServicesarchitecture.
ElementsofMultidimensionalModels
TofullyleveragetheSQLServer2005BIWorkbenchplatform,onemustfirstunderstandthebasicelementsofmultidimensionalmodeling.The
basicelementsofamultidimensionalcubeare: measures , dimensions ,and schema .
Measures
Measuresarethekeyperformanceindicatorsthatyouwanttoevaluate.Todeterminewhichofthenumbersinthedatamightbemeasures,aruleof
thumbis:Ifanumbermakessensewhenitisaggregated,thenitisameasure.Forexample,itmakessensetoaggregatedailyvolumetomonth,
quarterandyear.Ontheotherhand,aggregatingzipcodesortelephonenumberswouldnotmakesense;therefore,zipcodesandtelephone
numbersarenotmeasures.Typicalmeasuresincludevolume,sales,andcost.
Dimensions
30/07/2004 11:53:46
384138919.004.png 384138919.005.png 384138919.006.png
An Introduction to OLAP in SQL Server 2005
Page 2
Dimensionsarethecategoriesofdataanalysis.Theruleofthumbis:Whenareportisrequested"by"something,thatsomethingisusuallya
dimension.Forexample,inarevenuereportbymonthbysalesregion,thetwodimensionsneededaretimeandsalesregion.Forthisreason,OLAP
analystsoftennicknamedimensionsthe"bys."Typicaldimensionsincludeproduct,time,andregion.
Dimensionsarearrangedinhierarchicallevels,withuniquepositionswithineachlevel.Forexample,atimedimensionmayhavefourlevels,suchas
Year,Quarter,Month,andDay.Orthedimensionmighthaveonlythreelevels,forexample,Year,Week,andDay.Thevalueswithinthelevelsare
called members .Forexample,theyears2002and2003aremembersofthelevelYearintheTimedimension.
Webelieveasabestpractice,acubeshouldhavenomorethantwelvedimensions.Acube
withmorethantwelvedimensionsbecomesdifficulttounderstandandbrowse.Toomany
dimensionscancauseconfusionamongendusersandhavingtoomanydimensionsand
aggregationscanalsoleadto"dataexplosion."Asthenumberofdimensionsandlevels
increase,theamountofdatagrowsexponentially.Asmentionedearlier,anOLAP
applicationistypicallyusedtomanipulatelargevolumesofdata.Tooptimizeresponse
time,AnalysisServicesusuallypre-aggregateamultidimensionalschema.
Adimensioncanbethoughtofasatreestructure.ManyOLAPtoolspresentitinatree
control(see Figure2 ).Thisfamiliarsoftwarecontrolmakesusingdimensionseasierasit
allowsdimensionmembersandtheirrelationshipstobeviewedsimultaneously.This
simpleinterfacemakesusingthedimensionsextremelyuser-friendlyandallowsuserto
viewdataofdifferentlevelssimultaneously.
Schema
Thedimensionsandmeasuresarephysicallyrepresentedbyastarschema.Themost
basicstarschema arrangesthedimensiontablesaroundacentralfacttablethatcontains
themeasures(see Figure3 ).
Figure2 .ATree-StructuredMultidimensionalschema:
ThefigureshowsExcelPivotTableinterfaceconsistingof
treeviewforamultidimensionalstructure.
Afacttablecontainsacolumnforeachmeasureaswellasacolumnforeachdimension.Eachdimension
columnhasaforeign-keyrelationshiptotherelateddimensiontable,andthedimensioncolumnstaken
togetherarethekeytothefacttable.
Afterdeterminingthemeasures,dimensions,andschemausingtheBIWorkbench,thereisonemorestep—
youmustdecidewherethedataaggregationistobestored.Historically,therewerethreebasicstorage
options:MultidimensionalOLAP(MOLAP),RelationalOLAP(ROLAP),orHybridOLAP(HOLAP).SQLServer
2005's introductionof whatMicrosoftcallsthe UnifiedDimensionalModel,whichleveragesthebestof
relationalandOLAPcubetechnologies,allowsthedesignermanymorestorageoptionsandunlikeSQLServer
2000,allows combiningthemin thesamesolution.
DTS
Microsoft'sDataTransformationServices(DTS)isperhapsthemostcriticaltoolinanOLAPproject.DTSis
usedtopulldatafromvarioussourcesintothestarschema.Thedatawarehousewill,inturn,feedtheAnalysis
Servicesdatabase.Moreoftenthatnot,youmusttransformdatafromthesource(forexample,youmayhave
toconvertcurrencyvalues,balancecalculations,andthelike)andremapit.Microsofthasestimatedthatin
mostcases,organizationsspendeightypercentoftheirdatawarehousingontheextract,transform,andload(ETL)phase.
Figure3 .SimpleStarSchema:The
figureshowsabasicstar
schema withthedimension
tablesarrangedaroundacentral
facttablethatcontainsthe
measures.
VisualStudio2005hostsanewtool,BIWorkbench,whichisareplacementforDTSDesigner.ChiefamongtheimprovementsfoundinBI
Workbenchisitsseparationofcontrolflow(insertions,looping,sequencing,scripting,etc.)fromthedataflow(sourceidentification,aggregation,
charactermapping,anddataconversion)tasks(see Figure4 and Figure5 ).ThisseparationmakesDTSpackageseasiertoread,develop,and
maintain.BIWorkbenchisreasonenoughtolearnanduseVisualStudio2005.
Figure4 .ControlFlowTask
Diagram:Thefigureshowsatypical
controlflowandanassociatedtask
statement.
Figure5 .DataFlowDiagram:The
figureshowsatypicaldataflow
fromsourcetodestination.
BecauseDTShasbeencompletelyreworkedinSQLServer2005,currentSQL2000DTSuserwillneedtobrushuponDTS—andlearnafewnew
tricks.
WorkingwithAnalysisServices
Afteridentifyingthedimensionsandmeasuresyouwishtoanalyze,youcanuseAnalysisServicestoconstructanOLAPcube.
30/07/2004 11:53:46
384138919.007.png 384138919.001.png
An Introduction to OLAP in SQL Server 2005
Page 3
AnalysisServiceshasbuilt-inwizardsthatmaketheactualprocessofcreatingdimensionsfairlyeasy,especiallyifyou'realreadyfamiliarwithSQL
2000'sversion,althoughSQLServer2005'sversiondoesaddoneadditionalstep—youmustcreateaDataSourceViewtoimportyourdatabase
objects.
MDX
JustasyouuseSQLtoqueryrelationaldatabases,youuseMDXtoqueryamultidimensionalcube(see Figure
6 ).ForthosewhoareeagertointerrogatethecubewithoutlearningMDX,thereisanExcelPivotTableadd-in
thatprovidesadraganddropqueryinterface.ThisinterfacegeneratesMDXandqueriesthecubeonbehalfof
theuserandasaspecialbonustheresultsaredisplayedinExcel!
YouuseMDXusedtocreate"calculatedmeasures"thatwouldbetoocomplexorimpossibletodoinSQL.For
example,supposetheVPofSaleswantstoknowwhattheaveragesalespriceofeachproductis.
Unfortunately,averagesalespriceisnotameasureintheSalescube;however,StoreSalesandSalesCount
areavailable.BecauseyoucancalculateAverageSalesPricebydividingStoreSalesbySalesCount,youcan
calculate themeasure(ergothename"calculatedmeasure")byusingMDX.Here'stheMDXcode.
Figure6 .SQLvs.MDX:Thefigure
comparesdataextractionusing
SQLvs.MDX.
WITH
MEMBER Measures.[Average Sale Price] AS
'Measures.[Store Sales] /
Measures.[Sales Count]'
SELECT
{ Measures.[Average Sale Price] } ON COLUMNS,
{ Product.CHILDREN } ON ROWS
FROM Sales
Luckily,somethirdpartytoolsletuserscreatecalculatedmeasuresthatmayhavebeenintentionallyomittedfromtheoriginalcubedesign,suchas
commissionorbonuscalculations.
CubeBrowser
Aftercreatingthecube,youneedacubebrowsertoconnecttothecubeanddisplaythedata.Cubebrowsersusuallyprovideuser-friendlytree-
structureddimensionfiltersand/ordraganddropinterfacesthatallowenduserstointerrogatethecube.Youcansetuppre-definedqueries,orallow
adhocqueryingbylettinguserscombinethevariousmeasureswithdimensions.
Forexample,supposeyouwanttocreateareportthatshowsRevenuebySalesTerritorybyProduct.Becausedimensionsarehierarchical,youcan
obtainthedetailsofadimensionbydrillingdown.Thisusuallyinvolvesclickingonthedimension(forexample,clickingonSalesTerritorymayreveal
eachstore'slevelinthatdimension).
Dimensionscanhavemultiplelevels(suchasyear,quarter,andmonth).Userscanmixandmatchmemberswithinthesamedimension.
Furthermore,somecubebrowsersenabledeveloperstoexportcubebrowsersasaWebpartthattheycantheneasilyincludeinaportalsiteordigital
dashboard.
Therearethreebasictypesofcubebrowsers:
OfficeWebcomponentssuchastheExcelPivotTable
Third-partyapplicationssuchProClarity
Custom-builtapplications
SomeOLAPdevelopersfinddebuggingcubedesignandvalidatingdatausingpivottablesmucheasierthanperformingthesametasksusingthe
nativeAnalysisServicesscreen.
Tosumup,here'stheprocessinanutshell.
Determinetherequireddimensionsandmeasures.
UseDataTransformationServicestoextractdatafromyoursourcedatabases,transformingthedataasneeded,andloadingthefinished
dataintothecube.
UsetheBIWorkbench'sAnalysisServiceswizardstobuildthemeasures,dimensions,andschema.
Providecubebrowsersforyouruserssotheycanselectandviewreports.Ifnecessary,writeMDXqueriesoruseautomatedtools,suchas
ExcelPivotTablestoquerythecube.
Hopefully,thisprimerhaswhettedyourOLAPappetiteandgivenyoutheconfidencetostartcreatingOLAPcubesyourself.Agoodwaytogetstarted
istousethesampleFoodmartorAdventureWorksdatabasesthatshipwithSQLServer2005.
GailTieh isaProjectLeaderwithCitigateHudson'sPervasiveBusinessIntelligenceteam.Gail'sexpertiseincludesprocessimprovementthrough
theuseoftechnology.GailholdsanMBAinInformationSystemsandBAinEconomicsfromBaruchCollegeofCityUniversityofNewYork.She
currentlyservesontheBoardofNewYorkSoftwareIndustryAssociation(NYSIA)andisalsotheSpecialInterestGroupLeaderofNYSIA's
DatabaseProfessionalsCouncil.
30/07/2004 11:53:46
384138919.002.png 384138919.003.png
An Introduction to OLAP in SQL Server 2005
Page 4
DevXisadivisionofJupitermediaCorporation
©Copyright2004JupitermediaCorporation.AllRightsReserved. LegalNotices
30/07/2004 11:53:46
Zgłoś jeśli naruszono regulamin