Performance Tuning Windows 2012: Network Subsystem Part 2

In our previous article we discussed the hardware-supported features of some high-end network adapters. Let’s take a look at how you can use some of those settings to their best advantage. Remember that the correct settings depend on the network adapter, your workload, the resources of the host computer, and of course your performance goals.

Enabling Offload Features

Turning on network adapter offload features typically benefits performance. However, sometimes the network adapter might not be powerful enough to handle the offload capabilities with high throughput. For example, enabling segmentation offload can reduce the maximum sustainable throughput on some network adapters because of limited hardware resources. However, if the reduced throughput is not expected to be a limitation, you should enable offload capabilities, even for those network adapters. Some network adapters require offload features to be independently enabled for send and receive paths.

Enabling RSS for Web Scenarios

RSS can improve web scalability and performance if you have fewer network adapters than logical processors in your server. When all the web traffic is going through the RSS-capable network adapters, incoming web requests from different connections can be simultaneously processed across different CPUs. Because of the logic in RSS and HTTP for load distribution, performance can be degraded if a non-RSS-capable network adapter accepts web traffic on a server that has one or more RSS-capable network adapters. It is recommended that you use RSS-capable network adapters or disable RSS from the Advanced Properties tab. To determine whether a network adapter is RSS-capable, view the RSS information on the Advanced Properties tab for the device.
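To make the load-distribution idea concrete, here is a minimal sketch of how an RSS-style scheme can map connections to CPUs. It is an illustration only: real RSS-capable adapters compute a Toeplitz hash in hardware, whereas this sketch swaps in `zlib.crc32` as a stand-in. The names `pick_cpu` and `INDIRECTION_TABLE`, the table contents, and the addresses are all hypothetical.

```python
import zlib

# Hypothetical indirection table: each slot names the CPU that services it.
# Real adapters keep a comparable table; its size and contents are assumed here.
INDIRECTION_TABLE = [0, 1, 2, 3, 0, 1, 2, 3]


def pick_cpu(src_ip, src_port, dst_ip, dst_port, table=INDIRECTION_TABLE):
    """Map a connection's 4-tuple to a CPU via a hash and a lookup table.

    zlib.crc32 is a stand-in for the Toeplitz hash that RSS hardware uses.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode("ascii")
    return table[zlib.crc32(key) % len(table)]


if __name__ == "__main__":
    # Every packet of one connection hashes to the same CPU, while
    # different connections can be processed on different CPUs.
    for port in range(50000, 50008):
        cpu = pick_cpu("192.0.2.10", port, "198.51.100.5", 80)
        print(f"client port {port} -> CPU {cpu}")
```

Because the mapping depends only on the connection's addresses and ports, packets of one connection stay on one CPU. That is why the benefit appears when many connections arrive at once.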

RSS Profiles and RSS Queues

RSS Profiles are new in Windows Server 2012. The default profile is NUMA Static, which changes the default behavior from previous versions of Windows. We suggest reviewing the available profiles and understanding when they are beneficial. For example, you can use Task Manager to check whether your logical processors are underutilized for receive traffic, and if so, try increasing the number of RSS queues from the default of 2 to the maximum that your network adapter supports. Your network adapter may expose the number of RSS queues as an option in its driver.

Increasing Network Adapter Resources

For network adapters that allow manual configuration of resources, such as receive and send buffers, you should increase the allocated resources. Some network adapters set their receive buffers low to conserve allocated memory from the host. The low value results in dropped packets and decreased performance. Therefore, for receive-intensive scenarios, it is recommended to increase the receive buffer value to the maximum. If your adapter does not support or expose manual configuration, it most likely configures the resources dynamically, or the resources might be set to a fixed value that you can’t change.
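As a rough way to reason about receive buffer sizing, the sketch below estimates how long a set of receive buffers can absorb incoming packets while the host is not draining them. The model and all numbers are assumptions for illustration: it assumes one packet per buffer and a constant arrival rate, and ignores partial draining. `absorb_time_us` is a hypothetical helper, not a driver API.

```python
def absorb_time_us(receive_buffers, packets_per_second):
    """Microseconds until the receive buffers fill if nothing is drained.

    Assumes one packet per buffer and a constant arrival rate.
    """
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    return receive_buffers / packets_per_second * 1_000_000


if __name__ == "__main__":
    rate = 1_000_000  # assumed arrival rate: one million packets per second
    for buffers in (256, 512, 2048, 4096):
        headroom = absorb_time_us(buffers, rate)
        print(f"{buffers:5d} buffers -> {headroom:7.0f} us of headroom")
```

Under these assumptions, a larger receive buffer value gives the host proportionally more time to catch up before packets are dropped. This is the reasoning behind raising the value for receive-intensive workloads.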

Enabling Interrupt Moderation

To control interrupt moderation, some network adapters expose different interrupt moderation levels, buffer coalescing parameters (sometimes separately for send and receive buffers), or both. You should consider interrupt moderation for CPU-bound workloads, and weigh the trade-off: moderation saves host CPU at the cost of higher latency, whereas less moderation means more interrupts, higher host CPU usage, and lower latency. If the network adapter does not perform interrupt moderation, but it does expose buffer coalescing, increasing the number of coalesced buffers allows more buffers per send or receive, which improves performance.
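The trade-off can be illustrated with a deliberately simplified model. It assumes the adapter raises one interrupt per batch of coalesced packets and that packets arrive at a constant rate. Real adapters also use timers and adaptive schemes, so treat `moderation_tradeoff` as a hypothetical sketch rather than a description of any particular driver.

```python
def moderation_tradeoff(packets_per_second, packets_per_interrupt):
    """Return (interrupts per second, worst-case added latency in microseconds).

    Simplified model: one interrupt per `packets_per_interrupt` packets at a
    constant arrival rate. The first packet of a batch waits for the rest.
    """
    if packets_per_second <= 0 or packets_per_interrupt < 1:
        raise ValueError("rate must be positive and batch size at least 1")
    interrupts_per_second = packets_per_second / packets_per_interrupt
    added_latency_us = (packets_per_interrupt - 1) / packets_per_second * 1_000_000
    return interrupts_per_second, added_latency_us


if __name__ == "__main__":
    rate = 100_000  # assumed packet rate
    for batch in (1, 4, 16, 64):
        interrupts, latency = moderation_tradeoff(rate, batch)
        print(f"batch {batch:2d}: {interrupts:9.0f} interrupts/s, "
              f"up to {latency:6.1f} us added latency")
```

In this model, larger batches cut the interrupt rate (and the CPU time spent servicing interrupts) while increasing the time the earliest packet in a batch waits.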

Workload Specific Tuning

Your network adapter has a number of options to optimize latency caused by the operating system. This latency is the time between the network driver processing an incoming packet and the network driver sending the packet back, and it is usually measured in microseconds. As a comparison, the transmission time for packets transmitted over long distances is usually expressed in milliseconds. This tuning will not reduce the time a packet spends in transit.

Some tuning suggestions for microsecond-sensitive networks include:

· Set the computer BIOS to High Performance, with C-states disabled. However, this is system and BIOS dependent. Some systems provide higher performance if the operating system controls power management. You can check and adjust your power management settings from the Control Panel or by using the powercfg command.

· Set the operating system power management profile to High Performance System. For this to work as expected, your system’s BIOS has to be set to enable operating system control of power management.

· Enable Static Offloads, for example, UDP Checksums, TCP Checksums, and Large Send Offload (LSO).

· Enable RSS if the traffic is multi-streamed, such as high-volume multicast receive

· Disable the Interrupt Moderation setting for network drivers that require the lowest possible latency. The tradeoff is that this can use more CPU time.

· Handle network adapter interrupts and DPCs on a processor core that shares CPU cache with the core used by the program (user-mode thread) that is handling the packet. CPU affinity tuning can be used in conjunction with RSS configuration to direct a process to certain logical processors and accomplish this. Note that using the same core for the interrupt, DPC, and user-mode thread results in worse performance as load increases, because the ISR, DPC, and thread contend for the use of the core.
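The power management step in the list above can be scripted. The following is a minimal sketch, assuming a Windows host where the built-in `powercfg` alias `SCHEME_MIN` refers to the High Performance plan. The helper names `powercfg_commands` and `apply_commands` are hypothetical, and nothing runs on non-Windows systems.

```python
import platform
import subprocess

# powercfg alias for the High Performance plan ("minimum power savings").
HIGH_PERFORMANCE_ALIAS = "SCHEME_MIN"


def powercfg_commands():
    """Build the powercfg invocations: show the active plan, then switch plans."""
    return [
        ["powercfg", "/getactivescheme"],
        ["powercfg", "/setactive", HIGH_PERFORMANCE_ALIAS],
    ]


def apply_commands(commands):
    """Run the commands on Windows; return False without running elsewhere."""
    if platform.system() != "Windows":
        return False
    for argv in commands:
        subprocess.run(argv, check=True)
    return True


if __name__ == "__main__":
    for argv in powercfg_commands():
        print(" ".join(argv))
    # Uncomment on a Windows test machine to actually change the power plan:
    # apply_commands(powercfg_commands())
```

Keeping command construction separate from execution makes it easy to review what would run before changing a server's power plan.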

System Management Interrupts

Many hardware systems use System Management Interrupts (SMI) for a variety of maintenance functions, including reporting of ECC memory errors, legacy USB compatibility, fan control, and BIOS controlled power management. The SMI is the highest priority interrupt on the system and places the CPU in a management mode, which preempts all other activity while it runs an interrupt service routine, typically contained in BIOS.

This behavior can result in latency spikes of 100 microseconds or more. If you need to achieve the lowest latency, look for a BIOS version from your hardware provider that reduces SMIs to the lowest degree possible, referred to as “low latency BIOS” or “SMI free BIOS.” It is not possible to eliminate SMI activity altogether because it is used to control some essential functions, such as fan control.

Tuning TCP

TCP Receive Window Auto-Tuning

Prior to Windows Server 2008, the network stack used a fixed-size receive-side window that limited the overall potential throughput for connections. TCP receive window auto-tuning, one of the most significant changes to the TCP stack, addresses this limit. With the fixed-size default, you can calculate the total throughput of a single connection as:

Total achievable throughput in bytes per second = TCP window size in bytes * (1 / connection latency in seconds)

As an example, the achievable throughput is only 51 Mbps on a 1 Gbps connection with 10 ms latency. With auto-tuning, the receive-side window is adjustable, and it can grow to meet the demands of the sender. A single connection can then achieve the full line rate of a 1 Gbps link. Network usage requirements that might have been limited in the past by the total achievable throughput of TCP connections can now fully use the network.
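The fixed-window formula above can be checked with a few lines of Python. The article does not state the window size behind the 51 Mbps figure. The sketch assumes a fixed window of 64,240 bytes (roughly 64 KB), which is an assumption chosen for illustration, and `max_throughput_mbps` is a hypothetical helper name.

```python
def max_throughput_mbps(window_bytes, latency_seconds):
    """Upper bound on single-connection throughput with a fixed receive window."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    bytes_per_second = window_bytes * (1 / latency_seconds)
    return bytes_per_second * 8 / 1_000_000  # bytes/s -> megabits/s


if __name__ == "__main__":
    window = 64_240   # assumed fixed receive window in bytes (about 64 KB)
    latency = 0.010   # 10 ms connection latency

    print(f"Fixed window limit: {max_throughput_mbps(window, latency):.1f} Mbps")

    # Window a receiver would need to fill a 1 Gbps link at the same latency.
    needed_window = 1_000_000_000 / 8 * latency
    print(f"Window needed for 1 Gbps: {needed_window:,.0f} bytes")
```

With these assumed inputs the limit comes out at about 51 Mbps, while filling a 1 Gbps link at 10 ms would need a window of about 1.25 MB. Auto-tuning closes this gap by letting the window grow.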

Windows Filtering Platform

The Windows Filtering Platform (WFP), introduced in Windows Vista and Windows Server 2008, provides APIs that let non-Microsoft independent software vendors (ISVs) create packet processing filters. Examples include firewall and antivirus software. Be aware that a poorly written WFP filter can significantly decrease networking performance.

TCP Parameters

The following registry keywords from Windows Server 2003 are no longer supported, and they are ignored in Windows Server 2012, Windows Server 2008 R2, and Windows Server 2008:

· TcpWindowSize – HKLM\System\CurrentControlSet\Services\Tcpip\Parameters

· NumTcbTablePartitions – HKLM\System\CurrentControlSet\Services\Tcpip\Parameters

· MaxHashTableSize – HKLM\System\CurrentControlSet\Services\Tcpip\Parameters

Network-Related Performance Counters

This section lists the counters that are relevant to managing network performance.

Resource Utilization

· IPv4, IPv6

  • Datagrams Received/sec
  • Datagrams Sent/sec

· TCPv4, TCPv6

  • Segments Received/sec
  • Segments Sent/sec
  • Segments Retransmitted/sec

· Network Interface(*), Network Adapter(*)

  • Bytes Received/sec
  • Bytes Sent/sec
  • Packets Received/sec
  • Packets Sent/sec
  • Output Queue Length

The Output Queue Length counter is the length of the output packet queue (in packets). If this is longer than 2, delays occur. You should find the bottleneck and eliminate it if you can. Because NDIS queues the requests, this length should always be 0.

· Processor Information

  • % Processor Time
  • Interrupts/sec
  • DPCs Queued/sec

The DPCs Queued/sec counter is the average rate at which DPCs were added to the logical processor’s DPC queue. Each logical processor has its own DPC queue. This counter measures the rate at which DPCs are added to the queue, not the number of DPCs in the queue. It displays the difference between the values that were observed in the last two samples, divided by the duration of the sample interval.
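The rate calculation described above (difference between the last two samples, divided by the sample interval) can be sketched as follows. The function name and the sample values are hypothetical and serve only to show the arithmetic.

```python
def rate_per_second(previous_count, current_count, interval_seconds):
    """Rate counter value: change in the raw count divided by the interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current_count - previous_count) / interval_seconds


if __name__ == "__main__":
    # Hypothetical raw DPC counts observed 5 seconds apart.
    previous, current, interval = 120_000, 135_000, 5.0
    print(f"DPCs Queued/sec: {rate_per_second(previous, current, interval):.0f}")
```

The same arithmetic applies to the other "/sec" counters listed in this section, which is why a high value reflects recent activity, not a backlog.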

Potential Network Problems

· Network Interface(*), Network Adapter(*)

  • Packets Received Discarded
  • Packets Received Errors
  • Packets Outbound Discarded
  • Packets Outbound Errors

· WFPv4, WFPv6

  • Packets Discarded/sec

· UDPv4, UDPv6

  • Datagrams Received Errors

· TCPv4, TCPv6

  • Connection Failures
  • Connections Reset

· Network QoS Policy

  • Packets dropped
  • Packets dropped/sec

· Per Processor Network Interface Card Activity

  • Low Resource Receive Indications/sec
  • Low Resource Received Packets/sec

· Microsoft Winsock BSP

  • Dropped Datagrams
  • Dropped Datagrams/sec
  • Rejected Connections
  • Rejected Connections/sec

Receive Segment Coalescing (RSC) Performance

· Network Adapter(*)

  • TCP Active RSC Connections
  • TCP RSC Average Packet Size
  • TCP RSC Coalesced Packets/sec
  • TCP RSC Exceptions/sec

More Stories By Hovhannes Avoyan

Hovhannes Avoyan is the CEO of PicsArt, Inc.,
