Announcement

Collapse
No announcement yet.

Normalisation

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Normalisation

    Right, I've been through countless tutorials on how to normalise a database. It's something I have successfully done in the past, but quite a long time ago so I've forgotten a lot of it. Here is my UNF (normalised form) of a soccer match database:

    UNF

    Season
    Tier
    Division
    Fixture
    Result
    Date
    Status
    Time
    StadiumName
    Attendance
    RefereeName
    First Half Goals
    Second Half Goals

    I'm struggling with 1NF. I'm having difficulty understanding the meaning of 'repeating data'. The tutorials I've been through say you need to separate the repeating and non-repeating data. Okay, but I can look at all of the above and say it's repeating, because... all of the data under each of those headings repeats somewhere. I'm confused about it.

    I know that things from the above would have to be broken down during the normalisation process, such as 'Fixture', 'Result', 'RefereeName' and 'First Half Goals' & 'Second Half Goals', but I would appreciate any help for getting me started on this.

    Thanks.

  • #2
    The main trick of creating a datamodal is to prevent typing data more than once. So if you have a competition of clubs and you want to store the trainer and players of a team for every match then you will see that you are repeatingly typing over all data which is already there. Creating a good data-modal is not easy, but the steps are not too difficult if you take a practical attitude.
    You don't want to type every player, trainer and refereename over-and-over again. Suppose you have a typing error in referee and you want to make a list of all games for that referee you';ll be in trouble.

    So as a first step, just to get you started (didn't think it all through):

    tables for:

    referee: number and name
    player: number, and name (and other data: age etc).
    club: number and name (and other data: main stadium, city etc)
    team: clubnumber, season, playernumber
    match: matchnumber, season, clubnumberhome, clubnumberguest, refereenumber, date, result
    matchplayer: matchnumber, clubnumber, playernumber, fromminute, tominute (for changes)
    matchgoal: matchnumber, clubnumber, playernumber, minute
    Albert Drent
    aducom software netherlands
    scriptcase partner, reseller, support and (turn-key) development
    www.scriptcase.eu / www.scriptcase.nl

    Comment


    • #3
      Originally posted by aducom View Post
      The main trick of creating a datamodal is to prevent typing data more than once. So if you have a competition of clubs and you want to store the trainer and players of a team for every match then you will see that you are repeatingly typing over all data which is already there. Creating a good data-modal is not easy, but the steps are not too difficult if you take a practical attitude.
      You don't want to type every player, trainer and refereename over-and-over again. Suppose you have a typing error in referee and you want to make a list of all games for that referee you';ll be in trouble.

      So as a first step, just to get you started (didn't think it all through):

      tables for:

      referee: number and name
      player: number, and name (and other data: age etc).
      club: number and name (and other data: main stadium, city etc)
      team: clubnumber, season, playernumber
      match: matchnumber, season, clubnumberhome, clubnumberguest, refereenumber, date, result
      matchplayer: matchnumber, clubnumber, playernumber, fromminute, tominute (for changes)
      matchgoal: matchnumber, clubnumber, playernumber, minute
      Thank you for your response. I get what the benefits of a relational database is, and I roughly know what it's going to look like in terms of the final table structures, but I need to do it for a project I'm developing for a client. I need to demonstrate the steps I've taken to normalise the data (using 1NF, 2NF and 3NF). But the problem is, I'm struggling with 1NF. I don't quite get how to separate repeating and non-repeating data from the data I have. To me, it all looks repeating — but I know it's not. What is the repeating and non-repeating data from the ones I've given (in my previous post)?

      Comment


      • #4
        1st NV means that you have to separate datafields containing repeating data into separate fields. So if you have a field 'salary' which contains textdata like : 'scale 10, amount 1000', then you need to split this into separate fields. A field must contain only one single value. So if you have a field 'telephone' which can contain more than one number, you must make separate fields for this (or make a separate reference table). I suspect this i.e. for attendance (what does this field contain) and goals. (Do you want to keep the players who scored?). Hope this helps. The first NV means that each field is 'atomic' and can contain only one value (per record)
        Albert Drent
        aducom software netherlands
        scriptcase partner, reseller, support and (turn-key) development
        www.scriptcase.eu / www.scriptcase.nl

        Comment


        • #5
          Originally posted by aducom View Post
          1st NV means that you have to separate datafields containing repeating data into separate fields. So if you have a field 'salary' which contains textdata like : 'scale 10, amount 1000', then you need to split this into separate fields. A field must contain only one single value. So if you have a field 'telephone' which can contain more than one number, you must make separate fields for this (or make a separate reference table). I suspect this i.e. for attendance (what does this field contain) and goals. (Do you want to keep the players who scored?). Hope this helps. The first NV means that each field is 'atomic' and can contain only one value (per record)
          Okay, so from my data, I would say (in brackets below):

          UNF

          Season
          Tier (this is a number that would define what level the division is. Is this necessary?)
          Division
          Fixture (home team v away team these two can be separate fields)
          Result (home goals v away goals these two can be separate fields)
          Date
          Status
          Time
          StadiumName
          Attendance
          RefereeName (first and last name these could be separate fields)
          First Half Goals (home goals & away first half goals these could be separate fields)
          Second Half Goals (home goals & away second half goals these could be separate fields)

          So yeah, I could do that. But is that my 'repeating data' that I've put alongside brackets?

          Comment


          • #6
            I think so. Do you want to store the maker(s) of the goals? Then you should add a field for goalmaker. Later on (other NV) you will recognize this as a repeating field related to a master (game). Then you will create a separate table for this. For now you only define the separated fields needed to create the datamodal.
            Albert Drent
            aducom software netherlands
            scriptcase partner, reseller, support and (turn-key) development
            www.scriptcase.eu / www.scriptcase.nl

            Comment


            • #7
              Originally posted by aducom View Post
              I think so. Do you want to store the maker(s) of the goals? Then you should add a field for goalmaker. Later on (other NV) you will recognize this as a repeating field related to a master (game). Then you will create a separate table for this. For now you only define the separated fields needed to create the datamodal.
              No, I only want to store the data headings I have listed. It's obvious that there will be several tables by the end of the normalisation process:

              Match
              Team
              Referee
              Stadium
              Division

              Okay, so for my 1NF, I've addressed the issue of having multiple values in each field by breaking down some of the headings. Problem is, the table split up into two (meant to be repeating and non-repeating data). However, I'm sure I've done this wrong. In the examples I've seen, they conveniently split up the table because they have appropriate data to fit into two tables. Looking at my set, I don't know how to split it into two without splitting up what is necessary in each table.

              Season
              Division
              Status
              Date
              Time
              StadiumName
              Attendance

              HomeTeam
              AwayTeam
              HomeGoals
              AwayGoals
              RefereeFirstName
              RefereeLastName
              HomeFirstHalfGoals
              AwayFirstHalfGoals
              HomeSecondHalfGoals
              AwaySecondHalfGoals

              Comment


              • #8
                As soon as you go into 2nd and 3rd NV you will see that you will need (primary and foreign) keys to identify datasets. You'll have to split these data into groups by recognizing the repeating groups and assign keys to them. You have to identify the n:m relationships and split these into 1:n relationships. But for now, I don't see what's wrong with your list. Be aware that samples on the web always have a kind of 'ideal' situation in them.
                Albert Drent
                aducom software netherlands
                scriptcase partner, reseller, support and (turn-key) development
                www.scriptcase.eu / www.scriptcase.nl

                Comment


                • #9
                  Originally posted by aducom View Post
                  As soon as you go into 2nd and 3rd NV you will see that you will need (primary and foreign) keys to identify datasets. You'll have to split these data into groups by recognizing the repeating groups and assign keys to them. You have to identify the n:m relationships and split these into 1:n relationships. But for now, I don't see what's wrong with your list. Be aware that samples on the web always have a kind of 'ideal' situation in them.
                  After going to another forum and getting a second opinion, it seems like I should do away with the first half and second half goals fields, because they clash with the homeGoals and AwayGoals. It's repeating data. I know you answered my question in the other thread about creating automatic columns that add up the data in fields. So it is possible in that respect, but I'll leave it out for now to avoid confusion.

                  Well this is what someone else said for 1NF:

                  SeasonID
                  DivisionID
                  HomeTeamID
                  AwayTeamID
                  StadiumID
                  RefereeID
                  StatusID

                  HomeGoals
                  AwayGoals
                  Attendance
                  Date
                  Time

                  Problem now is how do I assign primary keys to that data? I'm guessing all the ones in the top table are primary keys. And then in 2NF would I split it all up in its individual tables with a table for each of the IDs above? Or would that come in 3NF?

                  Comment


                  • #10
                    No, now you have to search for the repeating groups. Then you will find the keys (in your case I expect that statusID would not be found in the 2nd NV). The final step is to look into the determinated tables to look for dependencies. Normally you would enter 'status'. But if you want to be sure that everybody is using the same status and avoid typing errors this value has a dependency. So you create a table for this. the primary keys need to be decided upon your structure. If you have a n-m relationship you need a table to convert this to 1-n relation ship (2nd nv).
                    Albert Drent
                    aducom software netherlands
                    scriptcase partner, reseller, support and (turn-key) development
                    www.scriptcase.eu / www.scriptcase.nl

                    Comment

                    Working...
                    X