An Introduction To Olap In Sql Server 2005.pdf
(
49 KB
)
Pobierz
An Introduction to OLAP in SQL Server 2005
An Introduction to OLAP in SQL Server 2005
Page 1
http://www.devx.com
Printed from
http://www.devx.com/dbzone/Article/21410
AnIntroductiontoOLAPinSQLServer2005
GetapreviewoftheupcomingSQLServer2005'sBusinessIntelligencesuiteandfindoutaboutthemajorOLAP
componentsofAnalysisServices.
byGailTieh
fyou'readatabasedeveloper,you'venodoubtheardofOn-LineAnalyticalProcessing(OLAP)andtheadvantagesofanalysisusingmulti-
dimensional,pre-aggregateddata.Maybeyou'veeventhoughtaboutcreatingyourownmultidimensionalcubestogiveyourenduserstruead
hoccapabilities,includingthecreationofcalculatedmeasures/KPIs.Ifyou'verelegatedthattasktothebackburnerbecauseitwastoocomplex,you'll
behappytoknowthatSQL2005hasmadetheprocesseasier.
ThisarticlediscussesthemajorOLAPcomponentsofAnalysisServices,allofwhichcanbeimplementedbyevenafirst-timecubebuilder.Afollow
uparticlebyMarkFrawleywillexaminethedifferencesbetweenAnalysisServicesinSQL2000andSQL2005.
WhyUseOLAP?
OLAPisusefulbecauseitprovidesfastandinteractiveaccesstoaggregateddataandtheabilitytodrilldowntodetail.OLAPletsusersviewand
interrogatelargevolumesofdata(oftenmillionsofrows)bypre-aggregatingtheinformation.Itputsthedataneededtomakestrategicdecisions
directlyintothehandsofthedecisionmakers,notonlythroughpre-definedqueriesandreports,butalsobecauseitgivesenduserstheabilityto
performtheirownadhocqueries,minimizingusers'dependenceondatabasedevelopers.
What'stheSecret?
OLAPleveragesexistingdatafromarelationalschemaordatawarehouse(datasource)byplacingkeyperformanceindicators(measures)into
context(dimensions).Onceprocessedintoamultidimensionaldatabase(cube),allofthemeasuresarepre-aggregated,whichmakesdataretrieval
significantlyfaster.Theprocessedcubecanthenbemadeavailabletobusinessuserswhocanbrowsethedatausingavarietyoftools,makingad
hocanalysisaninteractiveandanalyticalprocessratherthanadevelopmenteffort.SQLServer2005'sBIWorkbenchsubstantiallyimprovesupon
SQLServer2000'sBIcapability.
SQLServer2005BIWorkbenchPlatform
TheSQLServer2005BIWorkbenchsuiteconsistsoffivebasictools:
SQLServerRelationalDatabase—usedtocreaterelationaldatabase
AnalysisServices—usedtocreatemultidimensionalmodel(measures,dimensionsandschema)
DataTransformationServices(DTS)—usedtoextract,transformandloaddatafromsource(s)tothe
datawarehouseorschema
ReportingServices—usedtobuildandmanageenterprisereportingusingtherelationalor
multidimensionalsources
DataMining—usedtoextractinformationbasedonpredeterminedalgorithms.
Figure1
.AnalysisServices
Architecture:Thefigureshowsthe
relationshipofthevarious
technologytiersinvolvedinAnalysis
Services.
ThisremainderofthisarticlefocusesonmultidimensionalmodelingusingAnalysisServicesandbriefly
touchesuponDTS'srole.
Figure1
showstheAnalysisServicesarchitecture.
ElementsofMultidimensionalModels
TofullyleveragetheSQLServer2005BIWorkbenchplatform,onemustfirstunderstandthebasicelementsofmultidimensionalmodeling.The
basicelementsofamultidimensionalcubeare:
measures
,
dimensions
,and
schema
.
Measures
Measuresarethekeyperformanceindicatorsthatyouwanttoevaluate.Todeterminewhichofthenumbersinthedatamightbemeasures,aruleof
thumbis:Ifanumbermakessensewhenitisaggregated,thenitisameasure.Forexample,itmakessensetoaggregatedailyvolumetomonth,
quarterandyear.Ontheotherhand,aggregatingzipcodesortelephonenumberswouldnotmakesense;therefore,zipcodesandtelephone
numbersarenotmeasures.Typicalmeasuresincludevolume,sales,andcost.
Dimensions
http://www.devx.com/dbzone/Article/21410/1954?pf=true
30/07/2004 11:53:46
An Introduction to OLAP in SQL Server 2005
Page 2
Dimensionsarethecategoriesofdataanalysis.Theruleofthumbis:Whenareportisrequested"by"something,thatsomethingisusuallya
dimension.Forexample,inarevenuereportbymonthbysalesregion,thetwodimensionsneededaretimeandsalesregion.Forthisreason,OLAP
analystsoftennicknamedimensionsthe"bys."Typicaldimensionsincludeproduct,time,andregion.
Dimensionsarearrangedinhierarchicallevels,withuniquepositionswithineachlevel.Forexample,atimedimensionmayhavefourlevels,suchas
Year,Quarter,Month,andDay.Orthedimensionmighthaveonlythreelevels,forexample,Year,Week,andDay.Thevalueswithinthelevelsare
called
members
.Forexample,theyears2002and2003aremembersofthelevelYearintheTimedimension.
Webelieveasabestpractice,acubeshouldhavenomorethantwelvedimensions.Acube
withmorethantwelvedimensionsbecomesdifficulttounderstandandbrowse.Toomany
dimensionscancauseconfusionamongendusersandhavingtoomanydimensionsand
aggregationscanalsoleadto"dataexplosion."Asthenumberofdimensionsandlevels
increase,theamountofdatagrowsexponentially.Asmentionedearlier,anOLAP
applicationistypicallyusedtomanipulatelargevolumesofdata.Tooptimizeresponse
time,AnalysisServicesusuallypre-aggregateamultidimensionalschema.
Adimensioncanbethoughtofasatreestructure.ManyOLAPtoolspresentitinatree
control(see
Figure2
).Thisfamiliarsoftwarecontrolmakesusingdimensionseasierasit
allowsdimensionmembersandtheirrelationshipstobeviewedsimultaneously.This
simpleinterfacemakesusingthedimensionsextremelyuser-friendlyandallowsuserto
viewdataofdifferentlevelssimultaneously.
Schema
Thedimensionsandmeasuresarephysicallyrepresentedbyastarschema.Themost
basicstarschema arrangesthedimensiontablesaroundacentralfacttablethatcontains
themeasures(see
Figure3
).
Figure2
.ATree-StructuredMultidimensionalschema:
ThefigureshowsExcelPivotTableinterfaceconsistingof
treeviewforamultidimensionalstructure.
Afacttablecontainsacolumnforeachmeasureaswellasacolumnforeachdimension.Eachdimension
columnhasaforeign-keyrelationshiptotherelateddimensiontable,andthedimensioncolumnstaken
togetherarethekeytothefacttable.
Afterdeterminingthemeasures,dimensions,andschemausingtheBIWorkbench,thereisonemorestep—
youmustdecidewherethedataaggregationistobestored.Historically,therewerethreebasicstorage
options:MultidimensionalOLAP(MOLAP),RelationalOLAP(ROLAP),orHybridOLAP(HOLAP).SQLServer
2005's introductionof whatMicrosoftcallsthe UnifiedDimensionalModel,whichleveragesthebestof
relationalandOLAPcubetechnologies,allowsthedesignermanymorestorageoptionsandunlikeSQLServer
2000,allows combiningthemin thesamesolution.
DTS
Microsoft'sDataTransformationServices(DTS)isperhapsthemostcriticaltoolinanOLAPproject.DTSis
usedtopulldatafromvarioussourcesintothestarschema.Thedatawarehousewill,inturn,feedtheAnalysis
Servicesdatabase.Moreoftenthatnot,youmusttransformdatafromthesource(forexample,youmayhave
toconvertcurrencyvalues,balancecalculations,andthelike)andremapit.Microsofthasestimatedthatin
mostcases,organizationsspendeightypercentoftheirdatawarehousingontheextract,transform,andload(ETL)phase.
Figure3
.SimpleStarSchema:The
figureshowsabasicstar
schema withthedimension
tablesarrangedaroundacentral
facttablethatcontainsthe
measures.
VisualStudio2005hostsanewtool,BIWorkbench,whichisareplacementforDTSDesigner.ChiefamongtheimprovementsfoundinBI
Workbenchisitsseparationofcontrolflow(insertions,looping,sequencing,scripting,etc.)fromthedataflow(sourceidentification,aggregation,
charactermapping,anddataconversion)tasks(see
Figure4
and
Figure5
).ThisseparationmakesDTSpackageseasiertoread,develop,and
maintain.BIWorkbenchisreasonenoughtolearnanduseVisualStudio2005.
Figure4
.ControlFlowTask
Diagram:Thefigureshowsatypical
controlflowandanassociatedtask
statement.
Figure5
.DataFlowDiagram:The
figureshowsatypicaldataflow
fromsourcetodestination.
BecauseDTShasbeencompletelyreworkedinSQLServer2005,currentSQL2000DTSuserwillneedtobrushuponDTS—andlearnafewnew
tricks.
WorkingwithAnalysisServices
Afteridentifyingthedimensionsandmeasuresyouwishtoanalyze,youcanuseAnalysisServicestoconstructanOLAPcube.
http://www.devx.com/dbzone/Article/21410/1954?pf=true
30/07/2004 11:53:46
An Introduction to OLAP in SQL Server 2005
Page 3
AnalysisServiceshasbuilt-inwizardsthatmaketheactualprocessofcreatingdimensionsfairlyeasy,especiallyifyou'realreadyfamiliarwithSQL
2000'sversion,althoughSQLServer2005'sversiondoesaddoneadditionalstep—youmustcreateaDataSourceViewtoimportyourdatabase
objects.
MDX
JustasyouuseSQLtoqueryrelationaldatabases,youuseMDXtoqueryamultidimensionalcube(see
Figure
6
).ForthosewhoareeagertointerrogatethecubewithoutlearningMDX,thereisanExcelPivotTableadd-in
thatprovidesadraganddropqueryinterface.ThisinterfacegeneratesMDXandqueriesthecubeonbehalfof
theuserandasaspecialbonustheresultsaredisplayedinExcel!
YouuseMDXusedtocreate"calculatedmeasures"thatwouldbetoocomplexorimpossibletodoinSQL.For
example,supposetheVPofSaleswantstoknowwhattheaveragesalespriceofeachproductis.
Unfortunately,averagesalespriceisnotameasureintheSalescube;however,StoreSalesandSalesCount
areavailable.BecauseyoucancalculateAverageSalesPricebydividingStoreSalesbySalesCount,youcan
calculate
themeasure(ergothename"calculatedmeasure")byusingMDX.Here'stheMDXcode.
Figure6
.SQLvs.MDX:Thefigure
comparesdataextractionusing
SQLvs.MDX.
WITH
MEMBER Measures.[Average Sale Price] AS
'Measures.[Store Sales] /
Measures.[Sales Count]'
SELECT
{ Measures.[Average Sale Price] } ON COLUMNS,
{ Product.CHILDREN } ON ROWS
FROM Sales
Luckily,somethirdpartytoolsletuserscreatecalculatedmeasuresthatmayhavebeenintentionallyomittedfromtheoriginalcubedesign,suchas
commissionorbonuscalculations.
CubeBrowser
Aftercreatingthecube,youneedacubebrowsertoconnecttothecubeanddisplaythedata.Cubebrowsersusuallyprovideuser-friendlytree-
structureddimensionfiltersand/ordraganddropinterfacesthatallowenduserstointerrogatethecube.Youcansetuppre-definedqueries,orallow
adhocqueryingbylettinguserscombinethevariousmeasureswithdimensions.
Forexample,supposeyouwanttocreateareportthatshowsRevenuebySalesTerritorybyProduct.Becausedimensionsarehierarchical,youcan
obtainthedetailsofadimensionbydrillingdown.Thisusuallyinvolvesclickingonthedimension(forexample,clickingonSalesTerritorymayreveal
eachstore'slevelinthatdimension).
Dimensionscanhavemultiplelevels(suchasyear,quarter,andmonth).Userscanmixandmatchmemberswithinthesamedimension.
Furthermore,somecubebrowsersenabledeveloperstoexportcubebrowsersasaWebpartthattheycantheneasilyincludeinaportalsiteordigital
dashboard.
Therearethreebasictypesofcubebrowsers:
OfficeWebcomponentssuchastheExcelPivotTable
Third-partyapplicationssuchProClarity
Custom-builtapplications
SomeOLAPdevelopersfinddebuggingcubedesignandvalidatingdatausingpivottablesmucheasierthanperformingthesametasksusingthe
nativeAnalysisServicesscreen.
Tosumup,here'stheprocessinanutshell.
Determinetherequireddimensionsandmeasures.
UseDataTransformationServicestoextractdatafromyoursourcedatabases,transformingthedataasneeded,andloadingthefinished
dataintothecube.
UsetheBIWorkbench'sAnalysisServiceswizardstobuildthemeasures,dimensions,andschema.
Providecubebrowsersforyouruserssotheycanselectandviewreports.Ifnecessary,writeMDXqueriesoruseautomatedtools,suchas
ExcelPivotTablestoquerythecube.
Hopefully,thisprimerhaswhettedyourOLAPappetiteandgivenyoutheconfidencetostartcreatingOLAPcubesyourself.Agoodwaytogetstarted
istousethesampleFoodmartorAdventureWorksdatabasesthatshipwithSQLServer2005.
GailTieh
isaProjectLeaderwithCitigateHudson'sPervasiveBusinessIntelligenceteam.Gail'sexpertiseincludesprocessimprovementthrough
theuseoftechnology.GailholdsanMBAinInformationSystemsandBAinEconomicsfromBaruchCollegeofCityUniversityofNewYork.She
currentlyservesontheBoardofNewYorkSoftwareIndustryAssociation(NYSIA)andisalsotheSpecialInterestGroupLeaderofNYSIA's
DatabaseProfessionalsCouncil.
http://www.devx.com/dbzone/Article/21410/1954?pf=true
30/07/2004 11:53:46
An Introduction to OLAP in SQL Server 2005
Page 4
DevXisadivisionofJupitermediaCorporation
©Copyright2004JupitermediaCorporation.AllRightsReserved.
LegalNotices
http://www.devx.com/dbzone/Article/21410/1954?pf=true
30/07/2004 11:53:46
Plik z chomika:
dragon8227
Inne pliki z tego folderu:
warehouses.pdf
(174 KB)
SQL_Sams_Teach_Yourself_MySQL_in_21_Days_--_SAMS.pdf
(2746 KB)
SQL.A.BEGINNER.pdf
(3890 KB)
Projektowanie baz danych w Access.pdf
(1133 KB)
Professional Ado Net 2.0 (Programming With Sql Server 2005, Oracle And Mysql) -.pdf
(10867 KB)
Inne foldery tego chomika:
Algorytmy
C++
Excel VBA
Oracle
pack ADO
Zgłoś jeśli
naruszono regulamin