`
universsky
  • 浏览: 92739 次
文章分类
社区版块
存档分类
最新评论

C++ LANGUAGE TUTORIALS : DATA STRUCTURES

 
阅读更多

Data Structures

We have already learned how groups of sequential data can be used in C++. But this is somewhat restrictive, since in many occasions what we want to store are not mere sequences of elements all of the same data type, but sets of different elements with different data types.

Data structures

A data structure is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths. Data structures are declared in C++ using the following syntax:

struct structure_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;

where structure_name is a name for the structure type, object_name can be a set of valid identifiers for objects that have the type of this structure. Within braces { } there is a list with the data members, each one is specified with a type and a valid identifier as its name.

The first thing we have to know is that a data structure creates a new type: Once a data structure is declared, a new type with the identifier specified as structure_name is created and can be used in the rest of the program as if it was any other type. For example:

1
2
3
4
5
6
7
struct product {
  int weight;
  float price;
} ;

product apple;
product banana, melon;


We have first declared a structure type called product with two members: weight and price, each of a different fundamental type. We have then used this name of the structure type (product) to declare three objects of that type: apple, banana and melon as we would have done with any fundamental data type.

Once declared, product has become a new valid type name like the fundamental ones int, char or short and from that point on we are able to declare objects (variables) of this compound new type, like we have done with apple, banana and melon.

Right at the end of the struct declaration, and before the ending semicolon, we can use the optional field object_name to directly declare objects of the structure type. For example, we can also declare the structure objects apple, banana and melon at the moment we define the data structure type this way:

1
2
3
4
struct product {
  int weight;
  float price;
} apple, banana, melon;


It is important to clearly differentiate between what is the structure type name, and what is an object (variable) that has this structure type. We can instantiate many objects (i.e. variables, like apple, banana and melon) from a single structure type (product).

Once we have declared our three objects of a determined structure type (apple, banana and melon) we can operate directly with their members. To do that we use a dot (.) inserted between the object name and the member name. For example, we could operate with any of these elements as if they were standard variables of their respective types:

1
2
3
4
5
6
apple.weight
apple.price
banana.weight
banana.price
melon.weight
melon.price


Each one of these has the data type corresponding to the member they refer to: apple.weight, banana.weight and melon.weight are of type int, while apple.price, banana.price and melon.price are of type float.

Let's see a real example where you can see how a structure type can be used in the same way as fundamental types:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// example about structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
} mine, yours;

void printmovie (movies_t movie);

int main ()
{
  string mystr;

  mine.title = "2001 A Space Odyssey";
  mine.year = 1968;

  cout << "Enter title: ";
  getline (cin,yours.title);
  cout << "Enter year: ";
  getline (cin,mystr);
  stringstream(mystr) >> yours.year;

  cout << "My favorite movie is:\n ";
  printmovie (mine);
  cout << "And yours is:\n ";
  printmovie (yours);
  return 0;
}

void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
Enter title: Alien
Enter year: 1979

My favorite movie is:
 2001 A Space Odyssey (1968)
And yours is:
 Alien (1979)


The example shows how we can use the members of an object as regular variables. For example, the member yours.year is a valid variable of type int, and mine.title is a valid variable of type string.

The objects mine and yours can also be treated as valid variables of type movies_t, for example we have passed them to the function printmovie as we would have done with regular variables. Therefore, one of the most important advantages of data structures is that we can either refer to their members individually or to the entire structure as a block with only one identifier.

Data structures are a feature that can be used to represent databases, especially if we consider the possibility of building arrays of them:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
// array of structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

#define N_MOVIES 3

struct movies_t {
  string title;
  int year;
} films [N_MOVIES];

void printmovie (movies_t movie);

int main ()
{
  string mystr;
  int n;

  for (n=0; n<N_MOVIES; n++)
  {
    cout << "Enter title: ";
    getline (cin,films[n].title);
    cout << "Enter year: ";
    getline (cin,mystr);
    stringstream(mystr) >> films[n].year;
  }

  cout << "\nYou have entered these movies:\n";
  for (n=0; n<N_MOVIES; n++)
    printmovie (films[n]);
  return 0;
}

void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
Enter title: Blade Runner
Enter year: 1982
Enter title: Matrix
Enter year: 1999
Enter title: Taxi Driver
Enter year: 1976
 
You have entered these movies:
Blade Runner (1982)
Matrix (1999)
Taxi Driver (1976)


Pointers to structures

Like any other type, structures can be pointed by its own type of pointers:

1
2
3
4
5
6
7
struct movies_t {
  string title;
  int year;
};

movies_t amovie;
movies_t * pmovie;


Here amovie is an object of structure type movies_t, and pmovie is a pointer to point to objects of structure type movies_t. So, the following code would also be valid:

pmovie = &amovie;


The value of the pointer pmovie would be assigned to a reference to the object amovie (its memory address).

We will now go with another example that includes pointers, which will serve to introduce a new operator: the arrow operator (->):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// pointers to structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
};

int main ()
{
  string mystr;

  movies_t amovie;
  movies_t * pmovie;
  pmovie = &amovie;

  cout << "Enter title: ";
  getline (cin, pmovie->title);
  cout << "Enter year: ";
  getline (cin, mystr);
  (stringstream) mystr >> pmovie->year;

  cout << "\nYou have entered:\n";
  cout << pmovie->title;
  cout << " (" << pmovie->year << ")\n";

  return 0;
}
Enter title: Invasion of the body snatchers
Enter year: 1978
 
You have entered:
Invasion of the body snatchers (1978)


The previous code includes an important introduction: the arrow operator (->). This is a dereference operator that is used exclusively with pointers to objects with members. This operator serves to access a member of an object to which we have a reference. In the example we used:

pmovie->title


Which is for all purposes equivalent to:

(*pmovie).title


Both expressions pmovie->title and (*pmovie).title are valid and both mean that we are evaluating the member title of the data structure pointed by a pointer called pmovie. It must be clearly differentiated from:

*pmovie.title


which is equivalent to:

*(pmovie.title)


And that would access the value pointed by a hypothetical pointer member called title of the structure object pmovie (which in this case would not be a pointer). The following panel summarizes possible combinations of pointers and structure members:

Expression What is evaluated Equivalent
a.b Member b of object a
a->b Member b of object pointed by a (*a).b
*a.b Value pointed by member b of object a *(a.b)

Nesting structures


Structures can also be nested so that a valid element of a structure can also be in its turn another structure.

1
2
3
4
5
6
7
8
9
10
11
12
struct movies_t {
  string title;
  int year;
};

struct friends_t {
  string name;
  string email;
  movies_t favorite_movie;
  } charlie, maria;

friends_t * pfriends = &charlie;


After the previous declaration we could use any of the following expressions:

1
2
3
4
charlie.name
maria.favorite_movie.title
charlie.favorite_movie.year
pfriends->favorite_movie.year


(where, by the way, the last two expressions refer to the same member).

Defined data types (typedef)

C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef, whose format is:

typedef existing_type new_type_name ;

where existing_type is a C++ fundamental or compound type and new_type_name is the name for the new type we are defining. For example:

1
2
3
4
typedef char C;
typedef unsigned int WORD;
typedef char * pChar;
typedef char field [50]; 



In this case we have defined four data types: C, WORD, pChar and field as char, unsigned int, char* and char[50] respectively, that we could perfectly use in declarations later as any other valid type:

1
2
3
4
C mychar, anotherchar, *ptc1;
WORD myword;
pChar ptc2;
field name; 



typedef does not create different types. It only creates synonyms of existing types. That means that the type of myword can be considered to be either WORD or unsigned int, since both are in fact the same type.

typedef can be useful to define an alias for a type that is frequently used within a program. It is also useful to define types when it is possible that we will need to change the type in later versions of our program, or if a type you want to use has a name that is too long or confusing.

Unions

Unions allow one same portion of memory to be accessed as different data types, since all of them are in fact the same location in memory. Its declaration and use is similar to the one of structures but its functionality is totally different:

union union_name {
  member_type1 member_name1;
  member_type2 member_name2;
  member_type3 member_name3;
  .
  .
} object_names;


All the elements of the union declaration occupy the same physical space in memory. Its size is the one of the greatest element of the declaration. For example:

1
2
3
4
5
union mytypes_t {
  char c;
  int i;
  float f;
  } mytypes;



defines three elements:

1
2
3
mytypes.c
mytypes.i
mytypes.f



each one with a different data type. Since all of them are referring to the same location in memory, the modification of one of the elements will affect the value of all of them. We cannot store different values in them independent of each other.

One of the uses a union may have is to unite an elementary type with an array or structures of smaller elements. For example:

1
2
3
4
5
6
7
8
union mix_t {
  long l;
  struct {
    short hi;
    short lo;
    } s;
  char c[4];
} mix;



defines three names that allow us to access the same group of 4 bytes: mix.l, mix.s and mix.c and which we can use according to how we want to access these bytes, as if they were a single long-type data, as if they were two short elements or as an array of char elements, respectively. I have mixed types, arrays and structures in the union so that you can see the different ways that we can access the data. For a little-endian system (most PC platforms), this union could be represented as:


The exact alignment and order of the members of a union in memory is platform dependent. Therefore be aware of possible portability issues with this type of use.

Anonymous unions

In C++ we have the option to declare anonymous unions. If we declare a union without any name, the union will be anonymous and we will be able to access its members directly by their member names. For example, look at the difference between these two structure declarations:

structure with regular union structure with anonymous union
struct {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yen;
  } price;
} book;
struct {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yen;
  };
} book;

The only difference between the two pieces of code is that in the first one we have given a name to the union (price) and in the second one we have not. The difference is seen when we access the members dollars and yen of an object of this type. For an object of the first type, it would be:

1
2
book.price.dollars
book.price.yen



whereas for an object of the second type, it would be:

1
2
book.dollars
book.yen



Once again I remind you that because it is a union and not a struct, the members dollars and yen occupy the same physical space in the memory so they cannot be used to store two different values simultaneously. You can set a value for price in dollars or in yen, but not in both.

Enumerations (enum)

Enumerations create new data types to contain something different that is not limited to the values fundamental data types may take. Its form is the following:

enum enumeration_name {
  value1,
  value2,
  value3,
  .
  .
} object_names;


For example, we could create a new type of variable called colors_t to store colors with the following declaration:

enum colors_t {black, blue, green, cyan, red, purple, yellow, white}; 



Notice that we do not include any fundamental data type in the declaration. To say it somehow, we have created a whole new data type from scratch without basing it on any other existing type. The possible values that variables of this new type color_t may take are the new constant values included within braces. For example, once the colors_t enumeration is declared the following expressions will be valid:

1
2
3
4
colors_t mycolor;
 
mycolor = blue;
if (mycolor == green) mycolor = red; 



Enumerations are type compatible with numeric variables, so their constants are always assigned an integer numerical value internally. If it is not specified, the integer value equivalent to the first possible value is equivalent to 0 and the following ones follow a +1 progression. Thus, in our data type colors_t that we have defined above, black would be equivalent to 0, blue would be equivalent to 1, green to 2, and so on.

We can explicitly specify an integer value for any of the constant values that our enumerated type can take. If the constant value that follows it is not given an integer value, it is automatically assumed the same value as the previous one plus one. For example:

1
2
3
enum months_t { january=1, february, march, april,
                may, june, july, august,
                september, october, november, december} y2k;



In this case, variable y2k of enumerated type months_t can contain any of the 12 possible values that go from january to december and that are equivalent to values between 1 and 12 (not between 0 and 11, since we have made january equal to 1).

分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics