Tuesday, August 7, 2007

Introduction to .NET Framework and Components: Article 3



.NET Framework and Components


“.NET” COMPONENTS

Base Class Library:
Included with the .NET Framework is a set of .NET Framework Class Library (FCL) assemblies that contains several thousand type definitions, where each type exposes some functionality. All in all, the CLR and the FCL allow developers to build the following kinds of applications:
1. XML Web services: Methods that can be accessed over the Internet very easily. XML Web services are, of course, the main thrust of Microsoft’s .NET initiative.
2. Web Forms: HTML-based applications (Web sites). Typically, Web Forms applications will make database queries and Web service calls, combine and filter the returned information, and then present that information in a browser using a rich HTML-based user interface. Web Forms provides a Visual Basic 6 and Visual InterDev style development environment for Web applications written in any CLR language.
3. Windows Forms: Rich Windows GUI applications. Instead of using a Web Forms page to create our application’s UI, we can use the more powerful, higher performance functionality offered by the Windows desktop. Windows Forms applications can take advantage of controls, menus, and mouse and keyboard events, and they can talk directly to the underlying operating system. Like Web Forms applications, Windows Forms applications also make database queries and call XML Web services. Windows Forms provides a Visual Basic 6-like development environment for GUI applications written in any CLR language.
4. Windows console applications: For applications with very simple UI demands, a console application provides a quick and easy way to build an application. Compilers, utilities, and tools are typically implemented as console applications.
5. Windows services: It is possible to build service applications controllable via the Windows Service Control Manager (SCM) using the .NET Framework.
6. Component library: The .NET Framework allows you to build stand-alone components (types) that can be easily incorporated into any of the previously mentioned application types.
Because the FCL contains literally thousands of types, a set of related types is presented to the developer within a single namespace. For example, the System namespace contains the Object base type, from which all other types ultimately derive. In addition, the System namespace contains types for integers, characters, strings, exception handling, and console I/O as well as a bunch of utility types that convert safely between data types, format data types, generate random numbers, and perform various math functions. All applications will use types from the System namespace. To access any of the platform’s features, we need to know which namespace contains the types that expose the facilities we’re after. If we want to customize any type’s behavior, we can simply derive our own type from the desired FCL type. The object-oriented nature of the platform is how the .NET Framework presents a consistent programming paradigm to software developers. Also, developers can easily create their own namespaces containing their own types. These namespaces and types merge seamlessly into the programming paradigm. Compared to Win32 programming paradigms, this new approach greatly simplifies software development.
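As a minimal sketch of deriving from an FCL type and placing it in a namespace of our own, the C# example below customizes System.Collections.ArrayList so that it rejects null items. The Inventory namespace and the NonNullList class are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Collections;

// A hypothetical namespace of our own; it sits alongside the FCL namespaces.
namespace Inventory
{
    // Customizes an FCL type by deriving from it: an ArrayList that rejects null items.
    public class NonNullList : ArrayList
    {
        public override int Add(object value)
        {
            if (value == null)
                throw new ArgumentNullException("value");
            return base.Add(value);
        }
    }

    public static class Program
    {
        public static void Main()
        {
            NonNullList items = new NonNullList();
            items.Add("widget");
            items.Add(42);
            Console.WriteLine(items.Count); // prints 2
        }
    }
}
```

Code in other namespaces reaches the new type either by its full name, Inventory.NonNullList, or through a using Inventory; directive, exactly as it would reach an FCL type.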
Most of the namespaces in the FCL present types that can be used for any kind of application. Table3 lists some of the more general namespaces and briefly describes what the types in that namespace are used for.



Namespace                       Description of Contents

System                          All the basic types used by every application
System.Collections              Types for managing collections of objects; includes the popular collection types, such as stacks, queues, hash tables, and so on
System.Diagnostics              Types to help instrument and debug applications
System.Drawing                  Types for manipulating 2-D graphics; typically used for Windows Forms applications and for creating images that are to appear in a Web Forms page
System.EnterpriseServices       Types for managing transactions, queued components, object pooling, JIT activation, security, and other features to make the use of managed code more efficient on the server
System.Globalization            Types for National Language Support (NLS), such as string compares, formatting, and calendars
System.IO                       Types for doing stream I/O, walking directories and files
System.Management               Types used for managing other computers in the enterprise via Windows Management Instrumentation (WMI)
System.Net                      Types that allow for network communications
System.Reflection               Types that allow the inspection of metadata and late binding to types and their members
System.Resources                Types for manipulating external data resources
System.Runtime.InteropServices  Types that allow managed code to access unmanaged OS platform facilities such as COM components and functions in Win32 DLLs
System.Runtime.Remoting         Types that allow for types to be accessed remotely
System.Runtime.Serialization    Types that allow for instances of objects to be persisted and regenerated from a stream
System.Security                 Types used for protecting data and resources
System.Text                     Types to work with text in different encodings, such as ASCII or Unicode
System.Threading                Types used for asynchronous operations and synchronizing access to resources
System.Xml                      Types used for processing XML schemas and data


Table3: General Namespaces

In addition to the more general namespaces, the FCL also offers namespaces whose types are used for building specific application types. Table 4 lists some of the application-specific namespaces in the FCL.

Namespace               Application Type
System.Web.Services     Types used to build XML Web services
System.Web.UI           Types used to build Web Forms
System.Windows.Forms    Types used to build Windows GUI applications
System.ServiceProcess   Types used to build a Windows service controllable by the SCM

Table4: Application Specific Namespaces

The Common Type System:
Types expose functionality to our applications and components. Types are the mechanism by which code written in one programming language can talk to code written in a different programming language. Because types are at the root of the CLR, Microsoft created a formal specification, the Common Type System (CTS), that describes how types are defined and how they behave. The CTS specification states that a type can contain zero or more members of the following kinds:
1. Field: A data variable that is part of the object’s state. Fields are identified by their name and type.
2. Method: A function that performs an operation on the object, often changing the object’s state. Methods have a name, a signature, and modifiers. The signature specifies the calling convention, the number of parameters (and their sequence), the types of the parameters, and the type of value returned by the method.
3. Property: To the caller, this member looks like a field. But to the type implementer, it looks like a method (or two). Properties allow an implementer to validate input parameters and object state before accessing the value and/or calculate a value only when necessary. They also allow a user of the type to have simplified syntax. Finally, properties allow you to create read-only or write-only “fields.”
4. Event: An event allows a notification mechanism between an object and other interested objects. For example, a button could offer an event that notifies other objects when the button is clicked.
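The four kinds of members can be seen together in one small C# type. This is only an illustrative sketch: the Thermostat class and all of its member names are hypothetical and do not come from the FCL.

```csharp
using System;

public class Thermostat
{
    // Field: a data variable that is part of the object's state.
    private double celsius;

    // Event: notifies interested objects when the temperature changes.
    public event EventHandler TemperatureChanged;

    // Property: looks like a field to the caller, but validates input like a method.
    public double Celsius
    {
        get { return celsius; }
        set
        {
            if (value < -273.15)
                throw new ArgumentOutOfRangeException("value");
            celsius = value;
            EventHandler handler = TemperatureChanged;
            if (handler != null)
                handler(this, EventArgs.Empty);
        }
    }

    // Read-only property: the value is calculated only when it is asked for.
    public double Fahrenheit
    {
        get { return celsius * 9.0 / 5.0 + 32.0; }
    }

    // Method: performs an operation that changes the object's state.
    public void Reset()
    {
        Celsius = 0.0;
    }
}

public static class MembersDemo
{
    public static void Main()
    {
        Thermostat thermostat = new Thermostat();
        thermostat.TemperatureChanged += delegate { Console.WriteLine("changed"); };
        thermostat.Celsius = 100.0;               // raises the event
        Console.WriteLine(thermostat.Fahrenheit); // prints 212
    }
}
```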
The CTS also specifies the rules for type visibility and for access to the members of a type. For example, marking a type as public (called public in C#) exports the type, making it visible and accessible to any assembly. On the other hand, marking a type as assembly (called internal in C#) makes the type visible and accessible to code within the same assembly only. Thus, the CTS establishes the rules by which assemblies form a boundary of visibility for a type, and the CLR enforces the visibility rules. Regardless of whether a type is visible to a caller, the type gets to control whether the caller has access to its members.

The following list shows the valid options for controlling access to a method or a field:
1. Private: The method is callable only by other methods in the same class type.
2. Family: The method is callable by derived types, regardless of whether they are within the same assembly. Note that many languages (such as C++ and C#) refer to family as protected.
3. Family and assembly: The method is callable by derived types, but only if the derived type is defined in the same assembly. Many languages (such as C# and Visual Basic) don’t offer this access control. Of course, IL Assembly language makes it available.
4. Assembly: The method is callable by any code in the same assembly. Many languages refer to assembly as internal.
5. Family or assembly: The method is callable by derived types in any assembly. The method is also callable by any types in the same assembly. C# refers to family or assembly as protected internal.
6. Public: The method is callable by any code in any assembly.
In addition, the CTS defines the rules governing type inheritance, virtual functions, object lifetime, and so on. These rules have been designed to accommodate the semantics expressible in modern-day programming languages. In fact, we won’t even need to learn the CTS rules, since the language we choose will expose its own syntax and type rules in the way we’re familiar with today and will map that language-specific syntax into the “language” of the CLR when it emits the managed module.
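The access options listed above map onto C# keywords roughly as in the following sketch. The Account and SavingsAccount classes are hypothetical. The "family and assembly" option is left out because, as noted in the list, C# does not offer it.

```csharp
using System;

public class Account
{
    // Private: accessible only from inside Account.
    private decimal balance;

    // Family (protected in C#): accessible to derived types in any assembly.
    protected string owner;

    // Assembly (internal in C#): accessible to any code in this assembly.
    internal int branchCode = 7;

    // Family or assembly (protected internal in C#).
    protected internal bool flagged;

    public Account(string owner)
    {
        this.owner = owner;
    }

    // Public: callable by any code in any assembly.
    public void Deposit(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException("amount");
        balance += amount;
    }

    public decimal Balance
    {
        get { return balance; }
    }
}

public class SavingsAccount : Account
{
    public SavingsAccount(string owner) : base(owner) { }

    public string Describe()
    {
        // owner is reachable here only because SavingsAccount derives from Account.
        return owner + " (branch " + branchCode + ")";
    }
}

public static class AccessDemo
{
    public static void Main()
    {
        SavingsAccount account = new SavingsAccount("Ada");
        account.Deposit(25m);
        Console.WriteLine(account.Describe() + ": " + account.Balance);
    }
}
```

Trying to read account.balance or account.owner from AccessDemo.Main would be rejected by the compiler, which is the access rule at work.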

The Common Language Specification:
The .NET Framework admirably supports language interoperability. The Common Language Specification (CLS) means “many languages for one platform” (in contrast to Java’s “one language for many platforms”). For now it may be true that Windows is the only supported operating system, but for the programmers of the many languages that support .NET, the CLS is a major breakthrough.
By writing “CLS-compliant” code we construct classes and components that can be used by any language and its respective IDEs and development tools, without the need for complex COM and ActiveX interfaces and registration details.
The CLS is really a subset of the Common Type System. The rules that the Common Type System specifies for the runtime environment, such as type safety, determine how the CLS governs compliance at the code construction and compilation levels. The CTS protects the integrity of code by ensuring type safety; code constructs that risk type safety are excluded from the CLS. As long as we produce CLS-compliant code, it will pass verification against the CTS.
Table 5 provides an abridged list of software development features that must meet CLS compliance rules and indicates whether each feature applies to both developers and compilers or only to compilers.


Feature                     Applies to   CLS compliance

General                     All          Visibility and exposure; types that are exposed need to be compliant, but global static fields and methods do not
Naming                      All          Characters and casing; keywords. Names must be unique and signatures must ensure that return and parameter types are compliant
Types                       All          Fundamental types such as integer, Boolean, double and so on
Type members                All          Overloading, uniqueness and conversion operations
Methods                     All          Accessibility, calling conventions and parameter lists
Properties                  All          Accessor metadata, accessibility, modification, naming and parameters
Events                      All          Event methods and metadata, accessibility, modification, naming and parameters
Pointers                    All          Pointers are not compliant
Reference types (objects)   All          Construction and invocation
Class types                 All          Inheritance from at least one compliant class
Arrays                      All          Elements, dimensions and bounds
Enumerations                All          Underlying types, the Flags attribute and field members
Exceptions                  All          Must derive from the base System.Exception class
Custom attributes           All          Value encoding
Metadata                    Compilers    Compliance marking
Interfaces                  All          Signatures and modification

Table5: Abridged Version of the CLS

The CLS includes the language constructs that are needed by developers of all .NET languages. That may seem impossible, but the specification is not too big or complex for a .NET language to support, even though many of the languages differ greatly at the source code level (for instance Smalltalk, Pascal, and C#). Following are some of the benefits of the CLS:
1. Classes produced in one language can be inherited by classes written in another language.
2. Objects instantiated from the classes of a sender written in one language can be passed to the methods of receiver objects whose classes were created in other languages. The receiving objects accept the arguments and process them as if they were written in the same language as the receiver.
3. Exception handling, tracing, and profiling are language agnostic; we can debug across languages and even across processes. Exceptions can be raised in an object from one language and understood by an object created in another language.
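In C#, compliance checking can be requested from the compiler with the System.CLSCompliantAttribute attribute. The sketch below assumes a hypothetical Counter class. It exposes its value as a long, which is a CLS type, and explicitly opts out the member that uses uint, which is not.

```csharp
using System;

// Ask the compiler to check this assembly's public surface against the CLS.
[assembly: CLSCompliant(true)]

public class Counter
{
    private uint raw;

    // long is a CLS type, so any CLS-compliant language can use this member.
    public long Value
    {
        get { return raw; }
    }

    // uint is not a CLS type; the member is explicitly marked as non-compliant
    // so that the compiler does not warn about it.
    [CLSCompliant(false)]
    public uint RawValue
    {
        get { return raw; }
    }

    public void Increment()
    {
        raw++;
    }
}

public static class ClsDemo
{
    public static void Main()
    {
        Counter counter = new Counter();
        counter.Increment();
        counter.Increment();
        Console.WriteLine(counter.Value); // prints 2
    }
}
```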



The Common Language Runtime:
As its name suggests, the common language runtime (CLR) is a runtime that is usable by different and varied programming languages. The features of the CLR are available to any and all programming languages that target it. If the runtime uses exceptions to report errors, then all languages get errors reported via exceptions. If the runtime allows you to create a thread, then any language can create a thread. In fact, at runtime, the CLR has no idea which programming language the developer used for the source code. This means that we can develop our code in any programming language we desire, as long as the compiler we use targets the CLR.
Figure2 shows the process of compiling source code files. As the figure shows, you can create source code files using any programming language that supports the CLR. Then you use the corresponding compiler to check the syntax and analyze the source code. Regardless of which compiler you use, the result is a managed module. A managed module is a standard Windows portable executable (PE) file that requires the CLR to execute. In the future, other operating systems may use the PE file format as well.



Figure2: Compiling source code into managed modules


Table6 describes the parts of a managed module.


Part                             Description

PE header                        The standard Windows PE file header, which is similar to the Common Object File Format (COFF) header. This header indicates the type of file: GUI, CUI, or DLL, and it also has a timestamp indicating when the file was built. For modules that contain only IL code, the bulk of the information in the PE header is ignored. For modules that contain native CPU code, this header contains information about the native CPU code.
CLR header                       Contains the information (interpreted by the CLR and utilities) that makes this a managed module. The header includes the version of the CLR required, some flags, the MethodDef metadata token of the managed module’s entry point method (Main method), and the location/size of the module’s metadata, resources, strong name, some flags, and other less interesting stuff.
Metadata                         Every managed module contains metadata tables. There are two main types of tables: tables that describe the types and members defined in your source code and tables that describe the types and members referenced by your source code.
Intermediate language (IL) code  Code that the compiler produced as it compiled the source code. The CLR later compiles the IL into native CPU instructions.


Table6: Parts of Managed Modules

Metadata:
Most compilers of the past produced code targeted to a specific CPU architecture, such as x86, IA64, Alpha, or PowerPC. All CLR-compliant compilers produce IL code instead. IL code is sometimes referred to as managed code because the CLR manages its lifetime and execution.
In addition to emitting IL, every compiler targeting the CLR is required to emit full metadata into every managed module. In brief, metadata is simply a set of data tables that describe what is defined in the module, such as types and their members. In addition, metadata also has tables indicating what the managed module references, such as imported types and their members. Thus, metadata is nothing but data about the data. Metadata is a superset of older technologies such as type libraries and interface definition language (IDL) files. The important thing to note is that CLR metadata is far more complete. And, unlike type libraries and IDL, metadata is always associated with the file that contains the IL code. In fact, the metadata is always embedded in the same EXE/DLL as the code, making it impossible to separate the two. Because the compiler produces the metadata and the code at the same time and binds them into the resulting managed module, the metadata and the IL code it describes are never out of sync with one another. Metadata has many uses. Here are some of them:
1. Metadata removes the need for header and library files when compiling since all the information about the referenced types/members is contained in the file that has the IL that implements the type/members. Compilers can read metadata directly from managed modules.
2. Visual Studio .NET uses metadata to help us write code. Its IntelliSense feature parses metadata to tell us what methods a type offers and what parameters each method expects.
3. The CLR’s code verification process uses metadata to ensure that your code performs only “safe” operations.
4. Metadata allows an object’s fields to be serialized into a memory block, remoted to another machine, and then deserialized, re-creating the object and its state on the remote machine.
5. Metadata allows the garbage collector to track the lifetime of objects. For any object, the garbage collector can determine the type of the object and, from the metadata, know which fields within that object refer to other objects. Thus, a programmer is relieved from the overhead of memory management and checking memory leakage.
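Our own code can read the same metadata through the types in the System.Reflection namespace. As a rough sketch, the program below lists the overloads of Console.WriteLine from metadata alone. The MetadataDemo class and its Describe helper are hypothetical names.

```csharp
using System;
using System.Reflection;

public static class MetadataDemo
{
    // Builds a readable signature from the metadata of one method.
    public static string Describe(MethodInfo method)
    {
        ParameterInfo[] parameters = method.GetParameters();
        string[] parts = new string[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
        {
            parts[i] = parameters[i].ParameterType.Name + " " + parameters[i].Name;
        }
        return method.ReturnType.Name + " " + method.Name + "(" + string.Join(", ", parts) + ")";
    }

    public static void Main()
    {
        // Walk the public methods that the metadata records for System.Console.
        foreach (MethodInfo method in typeof(Console).GetMethods())
        {
            if (method.Name == "WriteLine")
                Console.WriteLine(Describe(method));
        }
    }
}
```

No header file or type library is consulted here; everything printed comes from the metadata stored alongside the IL of the assembly that defines Console.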

Combining Managed Modules into Assemblies:
The CLR doesn’t actually work with modules; it works with assemblies. An assembly is an abstract concept that can be difficult to grasp initially. First, an assembly is a logical grouping of one or more managed modules or resource files. Second, an assembly is the smallest unit of reuse, security, and versioning. Depending on the choices we make with our compilers or tools, we can produce a single-file or a multifile assembly.
Figure3 should help explain what assemblies are about. In this figure, some managed modules and resource (or data) files are being processed by a tool. This tool produces a single PE file that represents the logical grouping of files. What happens is that this PE file contains a block of data called the manifest. The manifest is simply another set of metadata tables. These tables describe the files that make up the assembly, the publicly exported types implemented by the files in the assembly, and the resource or data files that are associated with the assembly.



Figure3: Combining managed modules into assemblies

By default, compilers actually do the work of turning the emitted managed module into an assembly. An assembly allows you to decouple the logical and physical notions of a reusable, deployable, versionable component. Assemblies allow you to break up the deployment of the files while still treating all the files as a single collection. An assembly’s modules also include information, including version numbers, about referenced assemblies. This information makes an assembly self-describing. In other words, the CLR knows everything about what an assembly needs in order to execute. No additional information is required in the registry or in Active Directory. Because no additional information is needed, deploying assemblies is much easier than deploying unmanaged components.
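The self-describing nature of an assembly can be observed from code. The following sketch uses the System.Reflection.Assembly type to print the identity of the running assembly and the assemblies it references, together with their version numbers. ManifestDemo is a hypothetical class name, and the actual output depends on how and where the program is compiled.

```csharp
using System;
using System.Reflection;

public static class ManifestDemo
{
    public static void Main()
    {
        // The assembly that contains this type.
        Assembly assembly = typeof(ManifestDemo).Assembly;
        AssemblyName name = assembly.GetName();
        Console.WriteLine("Assembly: " + name.Name + ", version " + name.Version);

        // The referenced assemblies recorded when this assembly was built.
        foreach (AssemblyName reference in assembly.GetReferencedAssemblies())
        {
            Console.WriteLine("  references " + reference.Name + ", version " + reference.Version);
        }
    }
}
```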

Executing Assembly’s Code:
As mentioned earlier, managed modules contain both metadata and intermediate language (IL). IL is a CPU-independent machine language created by Microsoft. IL is much higher level than most CPU machine languages. IL understands object types and has instructions that create and initialize objects, call virtual methods on objects, and manipulate array elements directly. It even has instructions that throw and catch exceptions for error handling. IL can be considered as an object-oriented machine language. Usually, developers will program in a high-level language, such as C# or Visual Basic. The compilers for these high-level languages produce IL. However, like any other machine language, IL can be written in assembly language, and Microsoft does provide an IL Assembler, ILAsm.exe. Microsoft also provides an IL Disassembler, ILDasm.exe.
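ILDasm.exe is the usual tool for looking at IL, but the raw IL of a method can also be fetched through reflection. This is a minimal sketch, assuming a hypothetical IlDemo class with a trivial Add method. The exact bytes printed depend on the compiler and its optimization settings, so no output is shown.

```csharp
using System;
using System.Reflection;

public static class IlDemo
{
    // A trivial method whose IL we want to look at.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes that the compiler emitted for a public method of IlDemo.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlDemo).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetIl("Add");
        // Prints the bytes in hexadecimal; the final byte, 2A, is the IL "ret" instruction.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```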

IL and Protection of Intellectual Property:
A common concern is that IL doesn’t offer enough intellectual property protection for algorithms. In other words, anyone can easily reverse engineer our managed module by using a tool such as the IL Disassembler.
Yes, it’s true that IL code is higher level than most other assembly languages and that, in general, reverse engineering IL code is relatively simple. However, when implementing an XML Web service or a Web Forms application, our managed module resides on our server. Because no one except us can access the module, no one can use any tool to see the IL—our intellectual property is completely safe.
If we’re concerned about any of the managed modules that we distribute, we can obtain an obfuscator utility from a third-party vendor. These utilities “scramble” the names of all the private symbols in our managed module’s metadata. It will be difficult for someone to “unscramble” the names and understand the purpose of each method. Note that these obfuscators can only provide a little protection since the IL must be available at some point in order for the CLR to process it. If we don’t feel that an obfuscator offers the kind of intellectual property protection that we desire, we can consider implementing our more sensitive algorithms in some unmanaged module that will contain native CPU instructions instead of IL and metadata. Then we can use the CLR’s interoperability features to communicate between the managed and unmanaged portions of our application. Of course, this assumes that we’re not worried about people reverse engineering the native CPU instructions in our unmanaged code.

JIT Compiler:
Even though today’s CPUs can’t execute IL instructions directly, CPUs of the future might have this capability. To execute a method, its IL must first be converted to native CPU instructions. This is the job of the CLR’s JIT (just-in-time) compiler. Figure 4 shows what happens the first time a method is called.



Figure4 : Calling a method for the first time
Just before the Main method executes, the CLR detects all the types that are referenced by Main’s code. This causes the CLR to allocate an internal data structure that is used to manage access to each referenced type. In Figure 4, the Main method refers to a single type, Console, causing the CLR to allocate a single internal structure. This internal data structure contains an entry for each method defined by the type. Each entry holds the address where the method’s implementation can be found. When initializing this structure, the CLR sets each entry to an internal, undocumented function contained inside the CLR itself. This function is called JIT Compiler.
When Main makes its first call to WriteLine, the JIT Compiler function is called. The JIT Compiler function is responsible for compiling a method’s IL code into native CPU instructions. Because the IL is being compiled "just in time," this component of the CLR is frequently referred to as a JITter or a JIT compiler. When called, the JIT Compiler function knows what method is being called and what type defines this method. The JIT Compiler function then searches the defining assembly’s metadata for the called method’s IL. JIT Compiler next verifies and compiles the IL code into native CPU instructions. The native CPU instructions are saved in a dynamically allocated block of memory. Then, JIT Compiler goes back to the type’s internal data structure and replaces the address of the called method with the address of the block of memory containing the native CPU instructions. Finally, JIT Compiler jumps to the code in the memory block. This code is the implementation of the WriteLine method (the version that takes a String parameter). When this code returns, it returns to the code in Main, which continues execution as normal.
Main now calls WriteLine a second time. This time, the code for WriteLine has already been verified and compiled. So the call goes directly to the block of memory, skipping the JIT Compiler function entirely. After the WriteLine method executes, it returns to Main. Figure5 shows what the situation looks like when WriteLine is called the second time. A performance hit is incurred only the first time a method is called. All subsequent calls to the method execute at the full speed of the native code: verification and compilation to native code are not performed again.
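The stub-and-patch behaviour described above can be imitated with ordinary delegates. The sketch below is only a toy model and not how the CLR is implemented: a dictionary stands in for the type’s internal data structure, a delegate stands in for the compiled native code, and every name in it (JitStubModel, Register, Call, CompileCount) is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class JitStubModel
{
    // Stand-in for the type's internal data structure: one entry per method.
    private static readonly Dictionary<string, Func<string, string>> entries =
        new Dictionary<string, Func<string, string>>();

    // Counts how many times the pretend compiler has run.
    public static int CompileCount;

    // Initially every entry points at the stub, not at real code.
    public static void Register(string methodName)
    {
        entries[methodName] = delegate(string arg) { return Stub(methodName, arg); };
    }

    // Plays the role of the JIT Compiler function.
    private static string Stub(string methodName, string arg)
    {
        CompileCount++;
        // "Compile" the method; a delegate stands in for the native code.
        Func<string, string> compiled = delegate(string text) { return methodName + ": " + text; };
        // Patch the entry so that later calls skip the stub.
        entries[methodName] = compiled;
        // Jump to the freshly compiled code.
        return compiled(arg);
    }

    public static string Call(string methodName, string arg)
    {
        return entries[methodName](arg);
    }

    public static void Main()
    {
        Register("WriteLine");
        Console.WriteLine(Call("WriteLine", "Hello"));      // goes through the stub
        Console.WriteLine(Call("WriteLine", "Goodbye"));    // goes straight to the compiled code
        Console.WriteLine("compilations: " + CompileCount); // prints compilations: 1
    }
}
```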



Figure5 : Calling a method for the second time

We should also be aware that the CLR’s JIT compiler optimizes the native code just as the back-end of an unmanaged C++ compiler does. Again, it may take more time to produce the optimized code, but the code will execute with much better performance than if it hadn’t been optimized.
Many people think that managed applications could actually outperform unmanaged applications. There are many reasons to believe this. For example, when the JIT compiler compiles the IL code into native code at run time, the compiler knows more about the execution environment than an unmanaged compiler would know.
Here are some ways that managed code could outperform unmanaged code:
1. A JIT compiler could detect that the application is running on a Pentium 4 and produce native code that takes advantage of any special instructions offered by the Pentium 4. Usually, unmanaged applications are compiled for the lowest-common-denominator CPU and avoid using special instructions that would give the application a performance boost on newer CPUs.
2. A JIT compiler could detect that a certain test is always false on the machine that it is running on. For example, consider a method with code like this:
if (numberOfCPUs > 1) {
    ...
}
This code could cause the JIT compiler not to generate any CPU instructions if the host machine has only one CPU. In this case, the native code has been fine-tuned for the host machine: the code is smaller and executes faster.
3. The CLR could profile the code’s execution and recompile the IL into native code while the application runs. The recompiled code could be reorganized to reduce incorrect branch predictions depending on the observed execution patterns.
These are only a few of the reasons why future managed code could execute better than today’s unmanaged code. The performance is currently quite good for most applications, and it promises to improve as time goes on.
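The numberOfCPUs test from item 2 can be written as a small runnable program. The sketch assumes the value comes from Environment.ProcessorCount and is stored in a static readonly field, which the program cannot change after start-up. Whether a particular JIT compiler actually drops the unused branch is an implementation detail, and the CpuDemo class and ChooseStrategy method are hypothetical names.

```csharp
using System;

public static class CpuDemo
{
    // Read once when the type is initialized; the value cannot change afterwards.
    private static readonly int numberOfCPUs = Environment.ProcessorCount;

    public static string ChooseStrategy()
    {
        if (numberOfCPUs > 1)
        {
            // On a single-CPU machine this branch can never run.
            return "split the work across processors";
        }
        return "run on a single processor";
    }

    public static void Main()
    {
        Console.WriteLine(numberOfCPUs + " CPU(s): " + ChooseStrategy());
    }
}
```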
If our experiments show that the CLR’s JIT compiler doesn’t offer our application the kind of performance it requires, we may want to take advantage of the NGen.exe tool that ships with the .NET Framework SDK. This tool compiles all of an assembly’s IL code into native code and saves the resulting native code to a file on disk. At run time, when an assembly is loaded, the CLR automatically checks to see whether a precompiled version of the assembly also exists, and if it does, the CLR loads the precompiled code so that no compilation at run time is required.

