You Want to Do What with PHP?

by Kevin Schroeder

Overview

A creative look at the fundamentals of PHP programming, this manual shows practical but atypical examples of PHP code. Theories, considerations, and varying options—such as binary protocols, sharing data, extending PHP with Java, and scaling PHP applications with messaging—are presented as new approaches to solving problems. Taking into consideration operating system level options, this programming reference goes beyond the basics to offer more advanced and innovative options for building PHP applications.


Product Details

ISBN-13: 9781583475607
Publisher: MC Press
Publication date: 10/15/2010
Sold by: Barnes & Noble
Format: eBook
Pages: 400
File size: 23 MB

About the Author

Kevin Schroeder is the technical consultant for Zend Technologies, where he works on small and large web application deployments and is the primary subject-matter expert for their training courses. He has developed production software using PHP and several other languages and has extensive experience in system administration on Linux, Solaris, and Windows. He is the coauthor of The IBM i Programmer's Guide to PHP. He lives in Dallas, Texas.

Read an Excerpt

You Want To Do What with PHP?


By Kevin Schroeder

MC Press Online, LLC

Copyright © 2010 Kevin Schroeder
All rights reserved.
ISBN: 978-1-58347-560-7



CHAPTER 1

Networking and Sockets


You can think of this chapter as a foundational chapter in this book. Several other chapters will rely on what you learn here. This chapter also matters simply because although a lot of stuff happens on the network, PHP developers are often insulated from that stuff. For example, PHP's file_get_contents() function can return, via the streams API, the contents of a remote Web site just as if it were a local file. The statement:

$data = file_get_contents('test.php');

functions the same way as

$data = file_get_contents('http://www.php.net/');

Thanks to the streams API, this type of functionality is identical regardless of the data source, as long as your distribution of PHP supports the specified protocol.
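
To see this uniformity without needing a network connection, here is a minimal sketch that reads the same text from a temporary local file and from a data: URI, then lists the wrappers the running PHP build supports. The temporary file and the variable names are illustrative assumptions, not part of the original example.

```php
<?php
// Write a small local file so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, "hello streams");

// Same function, two different sources: a local path and a data: URI.
$fromFile = file_get_contents($path);
$fromData = file_get_contents('data://text/plain;base64,' . base64_encode("hello streams"));

unlink($path);

// List the protocols (wrappers) this PHP build supports.
$wrappers = stream_get_wrappers();

echo $fromFile, "\n";
echo $fromData, "\n";
echo in_array('http', $wrappers, true) ? "http wrapper available\n" : "http wrapper missing\n";
```

If the http wrapper is reported as missing, a call such as file_get_contents('http://www.php.net/') would fail on that installation.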

This approach has two upsides: it makes your code much simpler, and it makes your application faster. Because the core PHP functionality is written in C, you get the performance benefit of running compiled code, and compiled C code will almost always be faster than equivalent PHP code (unless you've royally messed up your C).

The downside of this approach is that you don't get as much exposure to the underlying layer. In most situations that's not a problem, but occasionally you will need to know some of what goes on "under the covers."


The OSI Model

The best place to start is at the beginning, so let's start there. From your point of view as a PHP programmer, the beginning is what is known as "layer 7."

What I'm talking about here is the Open Systems Interconnection (OSI) model. The OSI model is an abstract concept that defines various layers within a network's architecture. PHP typically takes care of network communications for you, even on the lower layers, but knowing what those network layers are can help you understand why your application does what it does.


The OSI Layers

Figure 1.1 lists the layers defined by the OSI model.

An important point to note about these layers is that each is built on the other. For example, layer 6 is responsible for transferring data from layer 5 to layer 7, and vice versa. In a typical networked PHP application, you send data from layer 7, the application layer. Layer 7 then passes the data to layer 6, which passes it to layer 5, and so on right down to layer 1. Then, on the other side of the application, when the data is received, layer 1 passes it to layer 2, which eventually bubbles it up to layer 7, where the client (or server) then receives it.

This is not a perfect description, nor is the transition between the OSI layers always perfect. For example, HTTP, being stateless, seems to miss layer 5, the session layer. For this chapter, however, we'll be concerning ourselves primarily with layers 3 and 4, and the preceding explanation is sufficient for the purpose of our discussion.

Let's take a closer look now at each layer of the OSI model.


Layer 7: Application Layer

The OSI layer with which you're probably most familiar is layer 7, the application layer. This layer contains the protocols that supply network services to your applications, such as Dynamic Host Configuration Protocol (DHCP), Hypertext Transfer Protocol (HTTP), Internet Message Access Protocol (IMAP), and Post Office Protocol 3 (POP3). Secure Sockets Layer (SSL) is also part of layer 7.

One way to think of layer 7 is as the last endpoint before data is handled in your application. If you are using a custom protocol or have built a custom handler for an existing protocol, your application will be delving into layer 7.

Layer 7 can also be seen as the payload that all the other protocols are working together to send.


Layer 6: Presentation Layer

You can think of the presentation layer as the raw data that will be passed. While layer 7 provides the structure (e.g., HTTP) that the request needs to follow, layer 6 defines the actual data that is going to be handed off. In other words, layer 6 defines the format of the data that a layer 7 protocol needs to use. ASCII is an example of a layer 6 protocol, as are Unicode and 8-bit Unicode Transformation Format (UTF-8).
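
As a rough illustration of why the data format matters, the sketch below compares the bytes behind a plain ASCII letter and an accented letter encoded as UTF-8. It assumes PHP 7 or later for the "\u{...}" escape, and the variable names are only for illustration.

```php
<?php
// "\u{E9}" is PHP 7+ syntax for the code point U+00E9, emitted as UTF-8 bytes.
$ascii = "e";
$utf8  = "\u{E9}";

// bin2hex() exposes the raw bytes that would actually travel over the wire.
$asciiHex = bin2hex($ascii); // one byte
$utf8Hex  = bin2hex($utf8);  // two bytes in UTF-8

echo "e      -> $asciiHex (", strlen($ascii), " byte)\n";
echo "U+00E9 -> $utf8Hex (", strlen($utf8), " bytes)\n";
```

Both values are a single character to a reader, but the receiving side has to know which encoding was used to turn the bytes back into text.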


Layer 5: Session Layer

From the point of view of HTTP, the session layer, which manages sessions between application processes, is unnecessary. The HTTP protocol is stateless, which means that individual requests don't know anything about any previous or subsequent HTTP requests. There are obviously workarounds, such as sessions, but the network architecture does not handle sessions.

Typical protocols that operate on the session layer are the Network Basic Input/Output System (NetBIOS), Secure Copy (SCP), and Secure Shell (SSH). On Windows networks, Server Message Block (SMB) — also known as the Common Internet File System (CIFS) — is the protocol typically used for file and print sharing, and SMB is also used by the open-source Samba project. SMB is an example of a layer 6 protocol that uses a layer 5 protocol, NetBIOS, to communicate.


Layer 4: Transport Layer

The transport layer is where we will spend much of this chapter. It is here that the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) reside. There is a separate TCP/IP model that could be seen as a competitor to OSI, but because we are going to look at the Internet Control Message Protocol (ICMP) and also talk a bit about Internet Protocol (IP), using the TCP/IP model did not seem appropriate for this discussion.

The transport layer takes data provided by the session layer or the presentation layer and encapsulates it in a format that can be sent to the network layer or extracted from the network layer to be passed to the session or presentation layer.

Among the responsibilities of the transport layer are:

• Ports: Virtual endpoints for a packet on a destination machine

• Flow control: Keeping the application from flooding the receiving machine with data

• Error detection and correction

Protocols on the transport layer are not required to handle any of these functions, but many, such as TCP, do.
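
To make the idea of a port a little more concrete, here is a small sketch that opens a TCP server and a TCP client inside one script and passes a line of text between them. It assumes the loopback interface (127.0.0.1) is available and that the script is allowed to open sockets; the message and variable names are illustrative.

```php
<?php
// Bind a TCP server to the loopback address; port 0 asks the OS for any free port.
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
if ($server === false) {
    throw new RuntimeException("Could not listen: $errstr ($errno)");
}

// The OS-assigned port is the virtual endpoint the client must address.
$address = stream_socket_get_name($server, false); // local name, "ip:port"
$port    = (int) substr(strrchr($address, ':'), 1);

// TCP is connection-oriented: the client connects before any data is sent.
$client = stream_socket_client("tcp://127.0.0.1:$port", $errno, $errstr, 5);
if ($client === false) {
    throw new RuntimeException("Could not connect: $errstr ($errno)");
}
$conn = stream_socket_accept($server, 5);

fwrite($client, "hello\n");
$received = fgets($conn);

fclose($client);
fclose($conn);
fclose($server);

echo "Server port: $port, received: $received";
```

The IP address gets the data to the machine; the port number decides which listening socket on that machine receives it.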


Layer 3: Network Layer

The network layer is responsible for the actual delivery and routing of transport-layer–level packets to an endpoint. Note that your computer, when connected to a network, does not have a "TCP address"; it has an "IP address." Once an IP packet has found its way to its destination IP address, the data bubbles up to the transport layer, where the port information is handled.

The network layer doesn't care if you're using TCP, UDP, or some other protocol. All it cares about is getting that packet to its endpoint. This layer is actually closer to UDP than TCP in its implementation. TCP requires a connection, whereas UDP does not. UDP is "fire and forget" (with some exceptions). IP is the same way. As long as the packet reaches its destination without error, IP is happy. If a connection is opened to a closed port using TCP, the network layer is indifferent about it. All it knows is that a packet was returned; it doesn't care that the TCP protocol on the server returned a TCP reset to a requested connection on a closed port. All IP cares about is that data is being sent to and from an IP address.


Layer 2: Data Link Layer

The data link layer is responsible for sending data only between devices on the same local area network. Ethernet is an example of a layer 2 protocol; so are Point-to-Point Protocol (PPP) and Serial Line Internet Protocol (SLIP). Media Access Control (MAC) is a sublayer within the data link layer. The MAC protocol is used so that a packet destined for a specific device will be ignored by all other devices on the network.

It is that specificity that enables a hub to work. When a packet comes in to a hub on one port, the hub sends it on to all the other connected ports. The Ethernet devices on those ports will ignore the packet, except for the device for which the packet is destined.

Like a hub, a switch is a data link layer device, but rather than simply broadcasting all packets to all connected devices, a switch learns which MAC addresses are on which physical ports. Based on that information, the switch relays packets only to the physical port on which a given MAC address is known to exist.
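
The learning behavior can be sketched in a few lines of PHP. This is a toy model, not how real switch firmware works: the function name, the four-port switch, and the MAC addresses are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical learning switch. Returns the list of ports a frame is relayed to.
 * $table maps MAC address => physical port and is updated by reference.
 */
function forwardFrame(array &$table, int $portCount, int $inPort, string $srcMac, string $dstMac): array
{
    // Learn: the source MAC is reachable through the port the frame arrived on.
    $table[$srcMac] = $inPort;

    // Known destination: relay only to that port (nothing to do if it is the arrival port).
    if (isset($table[$dstMac])) {
        return $table[$dstMac] === $inPort ? [] : [$table[$dstMac]];
    }

    // Unknown destination: behave like a hub and send to every other port.
    $ports = [];
    for ($p = 1; $p <= $portCount; $p++) {
        if ($p !== $inPort) {
            $ports[] = $p;
        }
    }
    return $ports;
}

$table  = [];
$first  = forwardFrame($table, 4, 1, 'd4:da:01:3f:94:a5', '23:87:82:cc:ab:59');
$second = forwardFrame($table, 4, 3, '23:87:82:cc:ab:59', 'd4:da:01:3f:94:a5');

echo implode(',', $first), "\n";
echo implode(',', $second), "\n";
```

The first frame is sent to every other port because the destination is not yet known; the reply goes to a single port because the switch learned where the original sender lives.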

Routers are much more intelligent than either hubs or switches. They let you subdivide networks, route packets differently, or choose not to route at all, as is the case with a firewall.

What you will actually notice is that if you are on a multiple-hop network, where your packet is routed through multiple routers, the packet will jump back and forth between layers 2 and 3. In a moment, we'll look at a chart that will help you visualize this behavior.


Layer 1: Physical Layer

I'm not even going to try to explain how the physical layer works because I'd get it wrong. Simply put, what this layer does is facilitate the raw transmission of bits from one machine to another.


Putting It Together

Having reviewed the various layers with some very basic descriptions of what goes on in each one, let's take a quick look at the life of a packet as it goes from your Web server to a browser. Figure 1.2 depicts the flow of this activity.

What happens on the server and the client probably makes a fair amount of sense. However, the router part needs a little explaining. First of all, the middle "Switch/Router" portion of this diagram could be duplicated several times, once for each router through which the packet must pass. But what's actually going on in the router's mind when a packet arrives? Here is that conversation:

* * *

Here I am listening on my MAC address 32:95:f6:7a:af:97 (192.168.0.1) and 5d:a4:05:67:e4:5f (192.168.1.1).

What's this? A packet on 32:95:f6:7a:af:97? I love getting packets. Ah! It's from d4:da:01:3f:94:a5. Let's take a look at that IP address. 192.168.0.5? I can handle that! Let's see, where is this packet supposed to go ... 192.168.1.42? Hmm, that's on the 192.168.1.0/24 network.

I know where this packet is supposed to go, so I'll just change the source MAC address to my 5d:a4:05:67:e4:5f. And I know the MAC address to the computer on the 192.168.1.0/24 network (she and I had coffee last week). It's 23:87:82:cc:ab:59. So let's change the destination MAC address to hers.

Almost there. I still have this data, so let's dump that into the new packet and send it on its way. Happy trails, little fella!

* * *


The elapsed time for this conversation is most likely measured in nanoseconds.
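
The router's monologue can be approximated in code. The sketch below is a simplification under stated assumptions: the interface list, the neighbor table, the array layout of a "packet," and the function name routePacket are all hypothetical, and a real router does far more (TTL handling, checksums, ARP lookups).

```php
<?php
// Hypothetical router with two interfaces, using the addresses from the conversation.
$interfaces = [
    ['network' => '192.168.0.0', 'mask' => '255.255.255.0', 'mac' => '32:95:f6:7a:af:97'],
    ['network' => '192.168.1.0', 'mask' => '255.255.255.0', 'mac' => '5d:a4:05:67:e4:5f'],
];
// Hypothetical neighbor table (IP => MAC) for directly attached hosts.
$neighbors = ['192.168.1.42' => '23:87:82:cc:ab:59'];

function routePacket(array $packet, array $interfaces, array $neighbors): ?array
{
    $dst = ip2long($packet['dstIp']);
    foreach ($interfaces as $if) {
        $mask = ip2long($if['mask']);
        // Does the destination fall inside the network attached to this interface?
        if (($dst & $mask) === ip2long($if['network']) && isset($neighbors[$packet['dstIp']])) {
            // Layer 2 addresses are rewritten; layer 3 addresses and payload are untouched.
            $packet['srcMac'] = $if['mac'];
            $packet['dstMac'] = $neighbors[$packet['dstIp']];
            return $packet;
        }
    }
    return null; // no matching interface or unknown neighbor
}

$incoming = [
    'srcMac'  => 'd4:da:01:3f:94:a5',
    'dstMac'  => '32:95:f6:7a:af:97',
    'srcIp'   => '192.168.0.5',
    'dstIp'   => '192.168.1.42',
    'payload' => 'data',
];
$outgoing = routePacket($incoming, $interfaces, $neighbors);
print_r($outgoing);
```

Only the two MAC addresses differ between $incoming and $outgoing, which is the back-and-forth between layers 2 and 3 described earlier.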

Routing

In the preceding example, you saw how a router might decide where to send an individual packet. However, you did not see the rules the router used to determine where to send the packet. Those rules are what constitute routing, and while routing is relatively simple for smaller networks, it's a bit of a black art when you're dealing with complex networks.

The thing is, when people think of routing, they often think of how routing is done on the Internet, where multiple routers are all trying to figure out the best way to get an individual packet to where it needs to go. Your local network does not actually work like this. There is no auto-discovery on your local network. At least there shouldn't be. I have never run into a situation where a PHP developer needed to understand a wide area network. But local area networks are quite common. Usually, you won't have to worry about doing too much routing because developers technically should be grouped on their own class C network. However, there may be times when a sandbox or quality assurance (QA) area is needed that should have special rules in place to make sure packets don't get into the wild.

I just referred to a "class C" network. As you are probably aware, only a limited number of IP addresses are available. An IP address, for Internet Protocol Version 4 (IPv4), is a 32-bit number, typically separated into four octets. (I won't be getting into IPv6.) Because each IP address is limited to 32 bits, the maximum number of IP addresses is 4,294,967,296, or 2^32.
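
A quick way to convince yourself that a dotted address really is one 32-bit number is to convert it both ways. The sketch below uses PHP's ip2long() and long2ip() and assumes a 64-bit PHP build, where the integer results are non-negative; the sample address is arbitrary.

```php
<?php
$ip = '192.168.2.35';

// ip2long() packs the four octets into one integer; long2ip() reverses it.
$asInt  = ip2long($ip);
$octets = array_map('intval', explode('.', $ip));

// Rebuild the same integer by hand: each octet is shifted into its own 8-bit slot.
$manual = ($octets[0] << 24) | ($octets[1] << 16) | ($octets[2] << 8) | $octets[3];

echo $asInt, "\n";
echo long2ip($asInt), "\n";
echo 2 ** 32, "\n"; // size of the whole IPv4 address space
```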

As you can imagine, simply claiming ownership over certain IP addresses wouldn't work too well. You could do it for your own organization, but as soon as you wanted to communicate with someone else, you would have no guarantee that there would not be an IP address collision somewhere. In other words, you would have no guarantee, for example, that 56.164.2.32 did not exist somewhere else.

This is where the Internet Assigned Numbers Authority (IANA) steps in. This organization is responsible for assigning IP address ranges to various organizations or regions. Typically, it does so by region, although legacy rules permit several private organizations to have their own top-level IP address space. Such assignments are no longer made except for specific needs. The current list of network assignments is available at http://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xml. This list, as you can see if you check it out, contains only class A networks that the IANA has assigned.

There's that word "class" network again. There are five different network classes, only three of which you will be concerned about. Table 1.1 summarizes IP network classes A, B, and C.

These three classes refer to how many bits in the 32-bit address you can allocate. Network classes are not generally used on the Internet any more, having been abandoned in favor of Classless Inter-Domain Routing (CIDR), but many organizations still follow this method.

The size restriction is managed by something called a subnet mask. What the subnet mask does is tell the router which bits of the IP address belong on which network.

We can actually demonstrate this structure using bitwise operations in PHP. Let's assume that there are three class C networks. They will be 192.168.0.0/24, 192.168.1.0/24, and 192.168.2.0/24. There will also be an IP address, 192.168.2.35. Looking at these numbers, it is easy for us to tell which network the IP address is on. But can PHP tell? Not without a fair amount of work, unless we use bitwise operations on the address. Figure 1.3 shows some code you could use to do netmask calculations in PHP.

This code prints out the following result:

192.168.2.35 is on network 192.168.2.0
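
Figure 1.3 is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of the kind of netmask calculation described above, assuming the three networks and the host address from the text and a class C netmask of 255.255.255.0. It is not the book's listing, and the variable names are my own.

```php
<?php
// Assumed inputs, taken from the surrounding text.
$networks = ['192.168.0.0', '192.168.1.0', '192.168.2.0'];
$netmask  = '255.255.255.0'; // class C, i.e. /24
$host     = '192.168.2.35';

// Bitwise AND with the netmask zeroes out the host bits, leaving the network bits.
$maskedHost = ip2long($host) & ip2long($netmask);

$result = null;
foreach ($networks as $network) {
    if ((ip2long($network) & ip2long($netmask)) === $maskedHost) {
        $result = "$host is on network $network";
        break;
    }
}

echo $result, "\n";
```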

At this point, you will either be saying "Aha!" or be starting to see it but not quite there yet. In case you are still working through it, let's add some debug information using the decbin() function to the script (Figure 1.4).

Figure 1.5 shows the output that results.

The first line here is what the host looks like in binary notation once it has been compared against the netmask using the bitwise AND (&) operation. What we do after that is take each of our networks and perform the & operation on it; if the network matches the netmasked IP address, we have found our network.
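
Figures 1.4 and 1.5 are likewise not reproduced here. A rough sketch of the same idea, under the same assumptions as before, prints the host, the netmask, and the masked result in binary so the AND operation can be read off bit by bit. The helper name toBinary is hypothetical, and the output layout is not the book's.

```php
<?php
// Hypothetical helper: render an IPv4 address as a zero-padded 32-bit binary string.
function toBinary(string $ip): string
{
    return str_pad(decbin(ip2long($ip)), 32, '0', STR_PAD_LEFT);
}

$netmask = '255.255.255.0';
$host    = '192.168.2.35';

// Apply the netmask, then convert back to dotted notation for display.
$masked = long2ip(ip2long($host) & ip2long($netmask));

echo "host    ", toBinary($host), "\n";
echo "netmask ", toBinary($netmask), "\n";
echo "masked  ", toBinary($masked), "\n";
```

Wherever the netmask has a 1, the host bit is copied through; wherever it has a 0, the result is 0, which is why the last eight bits of the masked value are all zeros.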

Networking then can often go one step further, using a process called subnetting. Subnetting takes place when you take a class A, B, or C network and split it up into non-defined segments. For example, a class C network can have a maximum of 254 machines on it. But what happens if you need to split that network between departments, say marketing and IT, that really have no business touching each other?

You have two options. The first option is to trust that nobody in IT is ever going to do anything to marketing's network. The other option is to split the class C network into two or more subnets. You do the latter by taking some of the additional bits that are left from the original class of the network and assigning them to one of the departments.

Because you are limited to working with bits, you cannot choose an arbitrary number of IP addresses for a given subnet; subnet sizes come in powers of two. You can calculate the possible netmasks by switching one bit at a time, starting from the end of the 32-bit netmask, to a zero. On a class C network, that means that the only possible netmask options are those listed in Table 1.2.

Subnet mask 255.255.255.254 is also available. But because each subnet needs to have one IP address as the network address and one IP address for broadcasting packets to the subnet, the 255.255.255.254 subnet mask leaves no addresses for hosts and is mostly useless to you.
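
Table 1.2 is not reproduced in this excerpt, but the netmask options for a class C network can be derived with a short loop. The sketch below assumes a 64-bit PHP build and walks the prefix lengths from /24 to /30; the column layout is my own and may not match the book's table.

```php
<?php
// For each prefix length from /24 to /30, derive the netmask and the address counts.
$rows = [];
for ($prefix = 24; $prefix <= 30; $prefix++) {
    $hostBits = 32 - $prefix;
    $mask     = (0xFFFFFFFF << $hostBits) & 0xFFFFFFFF; // clear the trailing host bits
    $total    = 2 ** $hostBits;                         // addresses per subnet
    $rows[]   = [
        'prefix'  => $prefix,
        'netmask' => long2ip($mask),
        'subnets' => 2 ** ($prefix - 24), // how many such subnets fit in one class C
        'usable'  => $total - 2,          // minus the network and broadcast addresses
    ];
}

foreach ($rows as $row) {
    printf(
        "/%d  %-15s  %2d subnets  %3d usable hosts each\n",
        $row['prefix'],
        $row['netmask'],
        $row['subnets'],
        $row['usable']
    );
}
```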

Let's take our original program for calculating netmasks and add support for subnets as well as netmasks other than for class A, B, or C networks. Figure 1.6 shows the code for the modified program.

Figure 1.7 shows the resulting output.

Using this calculation, you can now determine the subnetwork on which an IP address resides.
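
Figure 1.6 is not included in this excerpt either. One way such a calculation might look, assuming a 64-bit PHP build and CIDR-style prefix lengths, is sketched below. The function names networkAddress and isOnNetwork and the sample addresses are hypothetical.

```php
<?php
/**
 * Hypothetical helper: return the network address for an IP and a prefix length (0-32).
 */
function networkAddress(string $ip, int $prefix): string
{
    if ($prefix < 0 || $prefix > 32) {
        throw new InvalidArgumentException('Prefix must be between 0 and 32');
    }
    $long = ip2long($ip);
    if ($long === false) {
        throw new InvalidArgumentException("Not a valid IPv4 address: $ip");
    }
    // Build the netmask from the prefix length, then AND it with the address.
    $mask = (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF;
    return long2ip($long & $mask);
}

/**
 * Hypothetical helper: is $ip inside the network written as "a.b.c.d/prefix"?
 */
function isOnNetwork(string $ip, string $cidr): bool
{
    [$network, $prefix] = explode('/', $cidr);
    return networkAddress($ip, (int) $prefix) === networkAddress($network, (int) $prefix);
}

// A /26 splits a class C network into four subnets of 64 addresses each.
echo networkAddress('192.168.2.35', 26), "\n";
echo networkAddress('192.168.2.135', 26), "\n";
var_dump(isOnNetwork('192.168.2.35', '192.168.2.0/26'));
var_dump(isOnNetwork('192.168.2.135', '192.168.2.0/26'));
```

The same check could be applied to a visitor's address, for example the value in $_SERVER['REMOTE_ADDR'], to decide whether a request comes from a particular subnet.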

At this point, you might be asking yourself, "As a Web developer, why am I going through this in the first place?" There are several reasons:

• Because you should. Many PHP developers just assume that the network is there without knowing the underlying concepts. Although it may not always be necessary to have that knowledge, it will make you more aware of network-related issues you may encounter.

• The security concept of Defense in Depth posits that a security implementation should have multiple layers. This design lets you add a networking component to your overall security implementation should you need it.

• You could provide different content for people who are on a specific network (e.g., a competitor).

• By understanding some of the basics of networking, you can build better, faster, more secure, more interesting network services. From Web services to your own binary protocols, an understanding of basic network concepts helps you to write better programs.


(Continues...)

Excerpted from You Want To Do What with PHP? by Kevin Schroeder. Copyright © 2010 Kevin Schroeder. Excerpted by permission of MC Press Online, LLC.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Introduction
Chapter 1: Networking and Sockets
Chapter 2: Binary Protocols
Chapter 3: Character Encoding
Chapter 4: Streams
Chapter 5: SPL
Chapter 6: Asynchronous Operations with Some Encryption Thrown In
Chapter 7: Structured File Access
Chapter 8: Daemons
Chapter 9: Debugging, Profiling, and Good Development
Chapter 10: Preparing for Success
Index
