www.expresscomputeronline.com WEEKLY INSIGHT FOR TECHNOLOGY PROFESSIONALS
06 October 2008  

30 Minute Interview

Intelligent architecture

Eric Demers, Team Leader - Engineering, AMD, spoke to Nivedan Prakash about the company’s teraFLOPS graphics chip, codenamed RV770, and the chip’s architecture



Shrinking the die

The secret was good engineering. We spent quite a bit of time reviewing our previous products and put a lot of engineering effort into redesigning blocks to make them smaller and more efficient, using all we had learned. In addition, we rebalanced compute and bandwidth to achieve a ratio of capabilities to bandwidth that is more in line with current applications. We also changed the memory interface configuration, going for a more tuned per-channel/client organization for high-bandwidth clients.

There was also a significant amount of layout work done to achieve the small die. We started our floor planning and physical design work nearly a year before we sent the chip out for fabrication. The last months of the design were spent solely on physical design and achieving our projected area targets. While there is quite a bit of custom work done for all our chips (for example, all the I/O), the core was a standard cell design for the logic section, with custom memories to optimize area.

Changing the dispatcher

It took nearly 1.5 years of work from a design team standpoint to make the dispatcher more scalable while also offering new features and better performance. It is an evolution of the previous version and inherited its best parts, such as the ability to issue to multiple blocks (texture, ALU and others) in parallel. We tweaked and optimized the command queues to achieve better balance in the new design.

About GDDR5

The GDDR5 ATI Radeon HD 4870 boards are tuned to operate with higher memory and core speeds than the ATI Radeon HD 4850 boards to get the highest performance. As a result, they are currently more limited than the ATI Radeon HD 4850 GDDR3 boards in their ability to operate at scaled-down clocks when idle. This is a result of multiple constraints, but nothing inherent in the GDDR5 protocol. We are working on ways of improving the range of clock speeds we can support with GDDR5 boards, so we can further reduce idle power without affecting peak performance. Currently, the ATI Radeon HD 4870 boards have an idle power in the typical range for a performance board.

Micro-stuttering

Micro stuttering can be caused by multiple things. For example, in our previous product, the ATI Radeon HD 3870, one of the causes of micro stuttering was that the graphics clock was being increased and decreased too frequently during games. The ATI Radeon HD 3870 was one of the first AMD parts to introduce a programmable micro-controller to monitor and control the chip power through clocks and voltage. It was able to detect times when the application was not using it, and reduce its clock speed to conserve power. What we found is that within a single frame, when the CPU load was high, there were times when there was enough ‘starvation’ to cause the ATI Radeon HD 3870 to reduce its clock, even though it was running a game. When the next part of the frame came up, the graphics clock had already been reduced, so the rendering was slowed down until the chip detected a heavy load and resumed high clocks. This up/down on the clock saved power, but reduced overall performance and caused micro stuttering.
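The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of how the actual micro-controller works: the function name, the idle threshold, the ramp-up delay and the low-clock ratio are all hypothetical values chosen for illustration.

```python
def render_frame(segments, idle_threshold_ms, ramp_up_ms, low_clock_ratio):
    """Return the total frame time in ms for a toy clock-governor model.

    segments: list of (gap_ms, work_ms) pairs, where gap_ms is the time the
    GPU is starved waiting for the CPU and work_ms is the time the following
    work would take at full clock. All parameters are hypothetical.
    """
    total = 0.0
    for gap_ms, work_ms in segments:
        total += gap_ms
        if gap_ms >= idle_threshold_ms:
            # The governor lowered the clock during the gap. Work runs at the
            # low clock for ramp_up_ms until the heavy load is detected.
            done_at_low = ramp_up_ms * low_clock_ratio
            if done_at_low >= work_ms:
                # The segment finishes before the clock is raised again.
                total += work_ms / low_clock_ratio
            else:
                # The remainder of the segment runs at full clock.
                total += ramp_up_ms + (work_ms - done_at_low)
        else:
            total += work_ms
    return total


# One frame: 4 ms of work, a 3 ms starvation gap, then 4 ms more work.
frame = [(0, 4), (3, 4)]

with_governor = render_frame(frame, idle_threshold_ms=2, ramp_up_ms=2,
                             low_clock_ratio=0.5)
without_governor = render_frame(frame, idle_threshold_ms=float("inf"),
                                ramp_up_ms=2, low_clock_ratio=0.5)

print(with_governor)     # 12.0
print(without_governor)  # 11.0
```

In this model, the frame that triggers the clock drop takes longer than an identical frame that does not, which is the kind of frame-to-frame variation that shows up as stutter.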

There are other potential sources of micro stuttering. Some of them, for example, have to do with moving memory around, which can cause blackouts for either the CPU or the GPU. Others exist when the CPU and GPU are more unbalanced (fast GPU, slow CPU), where the CPU will not generate any frames for a while, then generate two frames. In that case, the time for frame 1 could be the idle time plus the render time, while frame 2 would be render time only. That could lead to 16ms and 1ms frame times (assuming 15ms idle and 1ms render times), which would appear as stuttering. Multi-GPU makes the problem worse, as the GPU consumption rate is even higher. We are investigating these and others, though it is a tall task to fix all of them while also achieving peak performance.
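The arithmetic in the unbalanced CPU/GPU example can be worked through in a few lines. This is a minimal sketch using the 15ms idle and 1ms render figures quoted above; the function name and the burst size parameter are assumptions made for illustration.

```python
def frame_times(idle_ms, render_ms, frames_per_burst=2):
    """Per-frame times when the CPU stalls, then submits a burst of frames."""
    # The first frame of a burst absorbs the idle wait; the rest only render.
    times = [idle_ms + render_ms]
    times += [render_ms] * (frames_per_burst - 1)
    return times


times = frame_times(idle_ms=15, render_ms=1)
average = sum(times) / len(times)

print(times)    # [16, 1]
print(average)  # 8.5
```

The average of 8.5ms per frame looks smooth on a frame rate counter, but the frames actually arrive 16ms and 1ms apart, and it is that uneven pacing that is perceived as stuttering.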

 


© Copyright 2001: Indian Express Newspapers (Mumbai) Limited (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in Mumbai by the Business Publications Division (BPD) of the Indian Express Newspapers (Mumbai) Limited. Site managed by BPD.