Which characters does the compiler consider valid in a programming language?

In the C programming language, the character set refers to a set of all the valid characters that we can use in the source program for forming words, expressions, and numbers.

The source character set contains all the characters that we can use in the source program text. The execution character set, on the other hand, consists of the characters that a program can interpret while it runs. The two sets need not be identical.

In this article, we will take a closer look at the Character Set in C according to the GATE Syllabus for CSE (Computer Science Engineering). Read ahead to know more.

Table of Contents

  • Use Of Character Set In C
  • Types Of Characters In C
    • Alphabets
    • Digits
    • Special Characters
    • White Spaces
  • Summary Of Special Characters In C
  • Purpose Of Character Set In C
    • ASCII Values
      • Control Characters
      • Printable Characters
      • Character Equivalence
  • Practice Problems On Character Set In C
  • FAQs

Use of Character Set in C

Just as we use words, numbers, and statements to communicate in any language, the C programming language has its own set of characters. These are known as the characters in C. They include digits, alphabets, special symbols, etc. A char in C is at least 8 bits wide, so most implementations can represent 256 distinct character values.

Every program that we write in C consists of statements. We use words for constructing these statements, and we use characters for constructing these words. These characters must come from the C language character set. Let us look at the set of characters offered by the C language.

Types of Characters in C

The C programming language provides support for the following types of characters. In other words, these are the valid characters that we can use in the C language:

  • Alphabets
  • Digits
  • Special Characters
  • White Spaces

All of these serve a different set of purposes, and we use them in different contexts in the C language.

Alphabets

The C programming language provides support for all the alphabets that we use in the English language. In simpler words, a C program supports a total of 52 alphabetic characters: 26 uppercase and 26 lowercase.

Type of Character      Description   Characters
Lowercase Alphabets    a to z        a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
Uppercase Alphabets    A to Z        A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z

Digits

The C programming language supports the ten digits from 0 to 9. We use them to construct numeric values and expressions in a program, and they can also appear in an identifier (though not as its first character).

Type of Character      Description   Characters
Digits                 0 to 9        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
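Because the C standard guarantees that the characters '0' to '9' have consecutive values, a digit character can be turned into its numeric value by subtracting '0'. The following is a minimal sketch of that idea; the helper names digit_value and parse_digits are hypothetical and chosen only for this illustration.

```c
#include <ctype.h>

/* Returns the numeric value (0 to 9) of a digit character,
   or -1 if c is not a digit. Hypothetical helper for illustration. */
int digit_value(char c)
{
    if (!isdigit((unsigned char)c)) {
        return -1;
    }
    /* '0'..'9' are guaranteed to be consecutive in C. */
    return c - '0';
}

/* Builds an int from the leading digits of s, stopping at the first
   non-digit. Overflow is not handled in this sketch. */
int parse_digits(const char *s)
{
    int value = 0;
    while (*s != '\0' && digit_value(*s) >= 0) {
        value = value * 10 + digit_value(*s);
        s++;
    }
    return value;
}
```

For example, digit_value('7') gives 7, and parse_digits("2023") gives 2023.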

Special Characters

We use special characters in the C language for specific purposes, such as mathematical operations, logical operations, checking of conditions, punctuation, and writing escape sequences (for example, the backspace).

One of these characters, the underscore, can also appear in identifiers. For instance, we use underscores to build a longer, more readable variable name such as total_marks.

The C programming language provides support for the following types of special characters:

Type of Character      Examples
Special Characters     ` ~ @ ! $ # ^ * % & ( ) [ ] { } < > + = _ - | / \ ; : ' " , . ?

White Spaces

The white spaces in the C programming language contain the following:

  • Blank Spaces
  • Carriage Return
  • Tab
  • New Line
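The white space characters listed above can be detected with the standard isspace function from <ctype.h>. The sketch below counts them in a string; count_white_spaces is a hypothetical name used only here. Note that, in the default "C" locale, isspace also accepts the vertical tab and form feed, which the list above does not mention.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the white space characters (blank, tab, new line,
   carriage return, and so on) in text. Hypothetical helper. */
size_t count_white_spaces(const char *text)
{
    size_t count = 0;
    for (; *text != '\0'; text++) {
        /* The cast avoids undefined behavior for negative char values. */
        if (isspace((unsigned char)*text)) {
            count++;
        }
    }
    return count;
}
```

For example, count_white_spaces("a b\tc\nd\r") gives 4: one blank, one tab, one new line, and one carriage return.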

Summary of Special Characters in C

Here is a table that lists all the types of characters that we can use in the C language:

Type of Character      Description   Characters
Lowercase Alphabets    a to z        a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
Uppercase Alphabets    A to Z        A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
Digits                 0 to 9        0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Special Characters     -             ` ~ @ ! $ # ^ * % & ( ) [ ] { } < > + = _ - | / \ ; : ' " , . ?
White Spaces           -             Blank Spaces, Carriage Return, Tab, New Line
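The categories in this summary map closely onto the classification functions in <ctype.h>. The following sketch sorts a character into one of them; the enum and the function name classify are assumptions made for this illustration, and the results assume the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical category names mirroring the rows of the summary table. */
enum char_type { LOWERCASE, UPPERCASE, DIGIT, WHITE_SPACE, SPECIAL, OTHER };

/* Returns the category of c according to the <ctype.h> tests. */
enum char_type classify(char c)
{
    unsigned char uc = (unsigned char)c;

    if (islower(uc)) return LOWERCASE;
    if (isupper(uc)) return UPPERCASE;
    if (isdigit(uc)) return DIGIT;
    if (isspace(uc)) return WHITE_SPACE;
    if (ispunct(uc)) return SPECIAL;   /* printable, not alphanumeric, not a space */
    return OTHER;                      /* e.g. remaining control characters */
}
```

For example, classify('_') gives SPECIAL and classify('\n') gives WHITE_SPACE.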

Purpose of Character Set in C

The character sets help in defining the valid characters that we can use in the source program or can interpret during the running of the program. For the source text, we have the source character set, while we have the execution character set that we use during the execution of any program.

But we have various types of character sets. For instance, one of the character sets follows the basis of the ASCII character definitions, while the other set consists of various kanji characters (Japanese).

The choice of character set does not change how the compiler works, but we must know that every character in a set has its own unique value. The C language treats each character as an integer value. Let us know a bit more about the ASCII characters.
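Since C treats each character as an integer, a char can be used directly in arithmetic. The sketch below shows this; the helper names are hypothetical, and the specific values mentioned afterwards assume an ASCII execution character set.

```c
/* Returns the integer value that the implementation assigns to c. */
int char_code(char c)
{
    return (int)c;
}

/* Returns the 1-based position of a lowercase letter in the alphabet,
   or 0 if c is not between 'a' and 'z'. This relies on ASCII, where
   the lowercase letters have consecutive codes. */
int alphabet_position(char c)
{
    if (c < 'a' || c > 'z') {
        return 0;
    }
    return c - 'a' + 1;
}
```

On an ASCII system, char_code('A') gives 65 and alphabet_position('c') gives 3.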

ASCII Values

Every character in the C character set described above has an equivalent ASCII value. ASCII stands for American Standard Code for Information Interchange. Standard ASCII defines 128 characters (codes 0 to 127), which fit in 7 bits, and the extended 8-bit sets built on it hold up to 256 characters. To accommodate and represent larger sets of characters, C provides a special type called the wide-character type, or wchar_t.

A majority of the ANSI-compatible C compilers accept the ASCII characters for both character sets, the source and the execution. Every ASCII character corresponds to a specific numeric value.
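As a rough illustration of the wide-character type, the sketch below measures a wide string and reports how many bits the type occupies. The helper names are hypothetical, and the width of the type is implementation-defined (commonly 16 or 32 bits), so the sketch makes no claim about a particular size.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Returns the number of wide characters in text, not counting
   the terminating null wide character. Hypothetical helper. */
size_t wide_length(const wchar_t *text)
{
    return wcslen(text);
}

/* Returns the number of bits in the wide-character type on this
   implementation; the value is implementation-defined. */
size_t wide_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}
```

For example, wide_length(L"\u6F22\u5B57") gives 2 for a wide string holding two kanji characters, each of which would not fit in a single 8-bit char.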

Here is a list of all the ASCII characters, along with their assigned numeric values.

Control Characters

ASCII Value   Character   Meaning
0             NUL         Null
1             SOH         Start of Heading
2             STX         Start of Text
3             ETX         End of Text
4             EOT         End of Transmission
5             ENQ         Enquiry
6             ACK         Acknowledgement
7             BEL         Bell
8             BS          Backspace
9             HT          Horizontal Tab
10            LF          Line Feed
11            VT          Vertical Tab
12            FF          Form Feed
13            CR          Carriage Return
14            SO          Shift Out
15            SI          Shift In
16            DLE         Data Link Escape
17            DC1         Device Control 1
18            DC2         Device Control 2
19            DC3         Device Control 3
20            DC4         Device Control 4
21            NAK         Negative Acknowledgement
22            SYN         Synchronous Idle
23            ETB         End of Transmission Block
24            CAN         Cancel
25            EM          End of Medium
26            SUB         Substitute
27            ESC         Escape
28            FS          File Separator
29            GS          Group Separator
30            RS          Record Separator
31            US          Unit Separator

Printable Characters

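ASCII Value   Character      ASCII Value   Character      ASCII Value   Character
32            Space          64            @              96            `
33            !              65            A              97            a
34            "              66            B              98            b
35            #              67            C              99            c
36            $              68            D              100           d
37            %              69            E              101           e
38            &              70            F              102           f
39            '              71            G              103           g
40            (              72            H              104           h
41            )              73            I              105           i
42            *              74            J              106           j
43            +              75            K              107           k
44            ,              76            L              108           l
45            -              77            M              109           m
46            .              78            N              110           n
47            /              79            O              111           o
48            0              80            P              112           p
49            1              81            Q              113           q
50            2              82            R              114           r
51            3              83            S              115           s
52            4              84            T              116           t
53            5              85            U              117           u
54            6              86            V              118           v
55            7              87            W              119           w
56            8              88            X              120           x
57            9              89            Y              121           y
58            :              90            Z              122           z
59            ;              91            [              123           {
60            <              92            \              124           |
61            =              93            ]              125           }
62            >              94            ^              126           ~
63            ?              95            _              127           DEL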

(DEL is also a control character.)
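The split between control and printable characters can be checked with the standard iscntrl and isprint functions. The sketch below counts each kind over the codes 0 to 127; the helper names are hypothetical, and the counts mentioned afterwards assume an ASCII system in the default "C" locale.

```c
#include <ctype.h>

/* Counts the codes in 0..127 for which the given <ctype.h>-style
   test returns non-zero. Hypothetical helper for illustration. */
int count_ascii_matching(int (*test)(int))
{
    int count = 0;
    for (int code = 0; code <= 127; code++) {
        if (test(code)) {
            count++;
        }
    }
    return count;
}

/* Number of printable characters, including the space. */
int count_printable(void)
{
    return count_ascii_matching(isprint);
}

/* Number of control characters, including DEL. */
int count_control(void)
{
    return count_ascii_matching(iscntrl);
}
```

Under those assumptions, count_printable() gives 95 (codes 32 to 126) and count_control() gives 33 (codes 0 to 31 plus DEL at 127).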

Character Equivalence

Here are all the characters in ASCII. The table below displays each character's octal, decimal, and hexadecimal values:

Character   Oct     Dec   Hex
\0          00      0     0x0
\001        01      1     0x1
\002        02      2     0x2
\003        03      3     0x3
\004        04      4     0x4
\005        05      5     0x5
\006        06      6     0x6
\007        07      7     0x7
\b          010     8     0x8
\t          011     9     0x9
\n          012     10    0xA
\v          013     11    0xB
\f          014     12    0xC
\r          015     13    0xD
\016        016     14    0xE
\017        017     15    0xF
\020        020     16    0x10
\021        021     17    0x11
\022        022     18    0x12
\023        023     19    0x13
\024        024     20    0x14
\025        025     21    0x15
\026        026     22    0x16
\027        027     23    0x17
\030        030     24    0x18
\031        031     25    0x19
\032        032     26    0x1A
\033        033     27    0x1B
\034        034     28    0x1C
\035        035     29    0x1D
\036        036     30    0x1E
\037        037     31    0x1F
(space)     040     32    0x20
!           041     33    0x21
"           042     34    0x22
#           043     35    0x23
$           044     36    0x24
%           045     37    0x25
&           046     38    0x26
'           047     39    0x27
(           050     40    0x28
)           051     41    0x29
*           052     42    0x2A
+           053     43    0x2B
,           054     44    0x2C
-           055     45    0x2D
.           056     46    0x2E
/           057     47    0x2F
0           060     48    0x30
1           061     49    0x31
2           062     50    0x32
3           063     51    0x33
4           064     52    0x34
5           065     53    0x35
6           066     54    0x36
7           067     55    0x37
8           070     56    0x38
9           071     57    0x39
:           072     58    0x3A
;           073     59    0x3B
<           074     60    0x3C
=           075     61    0x3D
>           076     62    0x3E
?           077     63    0x3F
@           0100    64    0x40
A           0101    65    0x41
B           0102    66    0x42
C           0103    67    0x43
D           0104    68    0x44
E           0105    69    0x45
F           0106    70    0x46
G           0107    71    0x47
H           0110    72    0x48
I           0111    73    0x49
J           0112    74    0x4A
K           0113    75    0x4B
L           0114    76    0x4C
M           0115    77    0x4D
N           0116    78    0x4E
O           0117    79    0x4F
P           0120    80    0x50
Q           0121    81    0x51
R           0122    82    0x52
S           0123    83    0x53
T           0124    84    0x54
U           0125    85    0x55
V           0126    86    0x56
W           0127    87    0x57
X           0130    88    0x58
Y           0131    89    0x59
Z           0132    90    0x5A
[           0133    91    0x5B
\           0134    92    0x5C
]           0135    93    0x5D
^           0136    94    0x5E
_           0137    95    0x5F
`           0140    96    0x60
a           0141    97    0x61
b           0142    98    0x62
c           0143    99    0x63
d           0144    100   0x64
e           0145    101   0x65
f           0146    102   0x66
g           0147    103   0x67
h           0150    104   0x68
i           0151    105   0x69
j           0152    106   0x6A
k           0153    107   0x6B
l           0154    108   0x6C
m           0155    109   0x6D
n           0156    110   0x6E
o           0157    111   0x6F
p           0160    112   0x70
q           0161    113   0x71
r           0162    114   0x72
s           0163    115   0x73
t           0164    116   0x74
u           0165    117   0x75
v           0166    118   0x76
w           0167    119   0x77
x           0170    120   0x78
y           0171    121   0x79
z           0172    122   0x7A
{           0173    123   0x7B
|           0174    124   0x7C
}           0175    125   0x7D
~           0176    126   0x7E
\177        0177    127   0x7F
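A row of this table can be reproduced with the %o, %d, and %X conversions of the printf family. The following sketch formats one row into a buffer; format_row is a hypothetical helper, and the values it produces match the table only on an ASCII system.

```c
#include <stdio.h>

/* Writes "<char> <octal> <decimal> <hex>" for c into buf, using the
   same 0 and 0x prefixes as the table. Returns the number of characters
   written (excluding the terminating null), as snprintf does. */
int format_row(char *buf, size_t size, char c)
{
    unsigned int code = (unsigned char)c;

    return snprintf(buf, size, "%c 0%o %d 0x%X", c, code, (int)code, code);
}
```

For example, on an ASCII system, calling format_row with a 32-byte buffer and 'A' fills the buffer with "A 0101 65 0x41".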

Practice Problems on Character Set in C

1. Which of these is a type of character set used in the C language?

A. Digits

B. Alphabets

C. Special Characters

D. All of the above

Answer – D. All of the above

2. What types of alphabets does the language support?

A. Lowercase Alphabets

B. Uppercase Alphabets

C. All of the above

Answer – C. All of the above

3. How many characters does the C programming language support in total?

A. 52

B. 26

C. 256

D. 86

Answer – C. 256

4. How many digits does the C programming language support in its character set?

A. Nine

B. Eight

C. Five

D. Ten

Answer – D. Ten


FAQs

What constitutes the white spaces in the C language?

In the C programming language, the white spaces contain the following:

  • Blank Spaces
  • Carriage Return
  • Tab
  • New Line

What are ASCII values in C?

Every character in the C character set has an equivalent ASCII value. ASCII stands for American Standard Code for Information Interchange. Standard ASCII defines 128 characters (codes 0 to 127), which fit in 7 bits, and the extended 8-bit sets built on it hold up to 256 characters.
A majority of the ANSI-compatible C compilers accept the ASCII characters for both character sets, the source and the execution. Every ASCII character corresponds to a specific numeric value.

What is wchar_t?

ASCII and its 8-bit extensions hold at most 256 characters, which we can represent in 8 bits or fewer. To accommodate and represent larger sets of characters, C provides a special type called the wide-character type, or wchar_t.

What is the use of special characters if we have digits in the C language?

We use special characters in the C language for specific purposes, such as mathematical operations, logical operations, checking of conditions, punctuation, and writing escape sequences (for example, the backspace).
One of these characters, the underscore, can also appear in identifiers. For instance, we use underscores to build a longer, more readable variable name.

What are characters used for in programming?

In computer science, a character is a display unit of information equivalent to one alphabetic letter or symbol. This relies on the general definition of a character as a single unit of written speech. Character can also be abbreviated as "chr" or "char."

Which character set does the Java language use?

A character represents any letter, digit, or other sign. Java uses the Unicode character set, and a Java char is a two-byte (16-bit) value. Unicode has characters representing almost all languages and writing systems around the world.

How do we use characters in C programming?

A character variable in C can store any single character, which we write enclosed within single quotes. To declare a variable of the character type, we use the keyword char (often pronounced "kar").
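As a small illustration of declaring and using char variables, the sketch below stores single characters and compares them against character constants written in single quotes. The function names are hypothetical and exist only for this example.

```c
#include <ctype.h>

/* Returns the first character of name, stored in a char variable. */
char first_initial(const char *name)
{
    char initial = name[0];   /* a char holds exactly one character */
    return initial;
}

/* Returns 1 if c is an English vowel (in either case), otherwise 0. */
int is_vowel(char c)
{
    char lower = (char)tolower((unsigned char)c);

    return lower == 'a' || lower == 'e' || lower == 'i' ||
           lower == 'o' || lower == 'u';
}
```

For example, first_initial("Dennis") gives 'D', and is_vowel('E') gives 1.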