📌 Key Takeaways
International supply risk stems from evaluating price alone while ignoring two critical dimensions: whether the supplier can execute reliably and whether their quality claims rest on verifiable proof.
- Two Orthogonal Axes Prevent Blind Spots: Exporter Reliability (60% weight) measures observable operational performance—delivery timeliness, document accuracy, dispute frequency—while Evidence Integrity (40% weight) scrutinizes whether certificates reference recognized test methods like TAPPI T 410 or ISO 2758, creating a complete risk picture that price alone obscures.
- Method-Named Certificates Separate Credible Claims from Noise: A certificate stating “basis weight: 150 g/m²” is unverifiable; “basis weight: 150 g/m² per TAPPI T 410” enables comparison across suppliers and alignment with your internal testing, eliminating specification disputes before they arise.
- Green-Amber-Red Thresholds Drive Cross-Team Consensus: Scores of 80–100 signal award-ready suppliers, 60–79 require conditional approval with tighter oversight, and below 60 triggers rejection or metric improvement before reconsideration—giving procurement, QA, and operations a shared decision language.
- The Matrix Travels with Your RFQ as Risk Documentation: A one-page table listing each metric, its 12-month value, normalized score, and weight creates an audit trail that justifies supplier selection to stakeholders and provides baseline data for quarterly performance tracking.
- Lab Accreditation Under ISO/IEC 17025 Establishes Technical Credibility: Accredited labs undergo regular audits confirming calibrated equipment, trained personnel, and documented procedures—suppliers unwilling to use accredited testing or provide method names signal systemic QA weakness worth avoiding.
Prepared = defensible shortlists and fewer post-award disputes.
Small business procurement managers, operations leads moving into sourcing roles, and QA professionals evaluating international kraft paper suppliers will find a practical framework here, one that sets the stage for the detailed metric definitions and implementation guidance that follow.
International kraft paper sourcing carries inherent risk. You’re evaluating suppliers thousands of miles away, often with limited visibility into their operational track record or quality assurance practices. A single poor decision—awarding a contract based on price alone—can cascade into missed shipments, document errors, and costly specification disputes that disrupt your production schedule.

The challenge isn’t just finding suppliers who claim they can deliver. It’s building a defensible, evidence-based shortlist that your entire team—procurement, quality assurance, and operations—can approve with confidence. You need a framework that evaluates what matters: observable reliability and verifiable quality proof.
This article introduces a two-axis Integration Score model that combines Exporter Reliability metrics with Evidence Integrity standards. The result is a transparent, repeatable evaluation system that travels with your RFQ and reduces downstream friction before you commit.
Why a Two-Axis Model De-Risks International Supply
Most sourcing decisions rely heavily on price comparisons and supplier promises. The problem with this approach is that it treats all quotes as equally credible, ignoring two critical dimensions: whether the supplier can actually execute reliably, and whether their quality claims are backed by method-named, standards-aligned proof.
The Integration Score addresses this gap by evaluating suppliers across two orthogonal axes. The first axis, Exporter Reliability, examines observable performance history—on-time delivery rates, documentation accuracy, dispute frequency, and operational maturity. The second axis, Evidence Integrity, scrutinizes the quality of the supplier’s technical proof, specifically whether their certificates of analysis reference recognized test methods like TAPPI T 410 for basis weight or ISO 2758 for burst strength.
By combining these two dimensions into a single numerical score (0–100), you create a common language that purchasing, QA, and operations can all understand. More importantly, you shift the conversation from “Who gave us the lowest price?” to “Which supplier presents the lowest risk at an acceptable cost?”
The model assigns 60% weight to Exporter Reliability and 40% to Evidence Integrity. This weighting reflects a practical reality: even perfect lab certificates don’t help if the supplier consistently ships late or submits incorrect commercial documentation. However, strong delivery performance loses value if the material itself fails to meet specifications because the supplier’s QA process lacks rigor.
Axis 1: Exporter Reliability (60% of Total Score)
Exporter Reliability measures how consistently a supplier executes the operational fundamentals of international trade. This axis comprises six distinct metrics, each normalized to a 0–100 scale before applying its weight.
On-Time Shipment Rate (20%)
This metric captures the percentage of shipments where the supplier met the agreed estimated time of departure (ETD) or estimated time of arrival (ETA) within an acceptable tolerance window, typically ±2 days. Calculate this over the most recent 12-month period to capture seasonal variations and recent performance trends. A supplier with a 95% on-time rate scores near 100 on this metric; a supplier at 70% scores proportionally lower.
Track both ETD adherence (did they ship when promised?) and ETA accuracy (did the cargo arrive as forecasted?). The distinction matters because ETD failures signal internal production or logistics planning issues, while ETA variances often reflect broader supply chain disruptions outside the supplier’s direct control.
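For illustration, here is a minimal Python sketch of one way to normalize a higher-is-better percentage metric. The 50% floor, below which a supplier scores zero, is a calibration assumption, not a fixed rule:

```python
def normalize_percentage(value_pct: float, floor_pct: float = 50.0) -> float:
    """Map a higher-is-better percentage onto a 0-100 score.

    Values at or below the floor score 0; 100% scores 100.
    The 50% floor is an assumption to calibrate to your risk tolerance.
    """
    if value_pct <= floor_pct:
        return 0.0
    return min(100.0, (value_pct - floor_pct) / (100.0 - floor_pct) * 100.0)

# With a 50% floor, a 95% on-time rate maps to 90 and a 70% rate to 40.
print(normalize_percentage(95.0), normalize_percentage(70.0))
```

The same function works for any percentage-based reliability metric, including the documentation accuracy measure described next.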
Documentation Accuracy (15%)
International shipments require a complex web of commercial documents: commercial invoice, packing list, bill of lading, verified gross mass (VGM) declaration, and shipper’s instructions. Errors in any of these—wrong HS codes, mismatched weights between packing list and VGM, or delayed bill of lading issuance—create customs delays, demurrage charges, and payment disputes.
Calculate documentation accuracy as the percentage of shipments where all required documents were correct on first submission and issued within agreed timelines. For example, the bill of lading should typically be issued within 2–3 business days of vessel departure. Suppliers who consistently meet this standard demonstrate operational discipline that translates to fewer surprises post-award.
Documentation errors are consistently cited among the top causes of customs delays for small importers, directly impacting working capital and production schedules.
Dispute Rate (10%)
Measure the number of claims filed per 100 completed shipments over the trailing 12 months. Claims include quality disputes, short shipments, damaged cargo, or invoice discrepancies. More revealing than the raw claim count is the percentage resolved on first pass without escalation. A supplier with occasional claims but a strong resolution process may be less risky than one with fewer claims but a history of protracted disputes.
Track this metric carefully for new trading lanes or products. A supplier with an excellent dispute rate in one grade (say, virgin kraft liner) may have limited experience in another (recycled testliner), which increases risk.
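A lower-is-better count needs the inverse treatment. This sketch assumes an illustrative cap of 5 claims per 100 shipments, at or above which the metric scores zero; set the cap from your own claims history:

```python
def normalize_dispute_rate(claims_per_100: float, cap: float = 5.0) -> float:
    """Map claims per 100 shipments (lower is better) onto a 0-100 score.

    Zero claims scores 100; `cap` or more scores 0. The cap of 5 is an
    illustrative assumption, not a published benchmark.
    """
    if claims_per_100 >= cap:
        return 0.0
    return (1.0 - claims_per_100 / cap) * 100.0

# 1 claim per 100 shipments -> 80.0; 3 claims -> 40.0
print(normalize_dispute_rate(1.0), normalize_dispute_rate(3.0))
```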
Years Exporting and Trade References (5%)
Operational maturity matters. A supplier who has been exporting for 10+ years and can provide verified customer references for the specific lane you’re evaluating (for example, Indonesia to your destination port) demonstrates that they’ve navigated the learning curve. Request at least two trade references and confirm them independently. Ask references about communication responsiveness, claim handling, and whether they would award repeat business.
New exporters aren’t automatically disqualified, but they should demonstrate compensating strengths in other reliability metrics or strong backing from an established mill or trading group.
Incoterms Adherence and Milestone Timeliness (5%)
Incoterms define the precise responsibilities of buyer and seller at each stage of shipment. A supplier quoting CFR (Cost and Freight) must arrange and pay for main carriage to the destination port but transfers risk to the buyer once the goods are loaded on board the vessel at the origin port. A supplier quoting CIF (Cost, Insurance, and Freight) must additionally provide minimum insurance coverage.
Evaluate whether the supplier clearly states which Incoterm applies and whether they consistently fulfill their obligations. For instance, under CIF terms, does the supplier provide an insurance certificate showing Institute Cargo Clauses coverage? Do they issue the bill of lading promptly, or do delays force you to chase documentation?
Clarity and consistency here prevent mid-transaction disputes over who pays for what and when risk transfers.
Quality System and Chain-of-Custody Certifications (5%)
Certifications like ISO 9001 (quality management systems) signal that a supplier operates under documented, audited procedures. For buyers requiring certified fiber, FSC Chain of Custody certification demonstrates traceability through the supply chain. These aren’t guarantees of perfection, but they do indicate organizational maturity and a commitment to third-party accountability.
Verify certifications directly with the issuing body rather than relying solely on scanned certificates. Most registries maintain online databases where you can confirm the scope and validity of a supplier’s certification.
Axis 2: Evidence Integrity (40% of Total Score)
Evidence Integrity shifts the focus from operational performance to the quality and credibility of the supplier’s technical proof. This axis prevents scenarios where a supplier with strong delivery performance ships material that fails specifications because their QA process was inadequate.
Certificate of Analysis Completeness with Method-Named Values (20%)

A meaningful certificate of analysis (COA) includes not just test results but also the specific method used to generate each result. For example, stating “basis weight: 150 g/m²” is insufficient. A complete COA states “basis weight: 150 g/m² per TAPPI T 410” or “burst strength: 350 kPa per ISO 2758.”
Method names matter because different test methods can yield different results for the same material property. Tensile strength measured by TAPPI T 494 involves specific specimen dimensions, load application rates, and break detection criteria. A supplier who omits the method name makes it impossible to verify whether their results are comparable to your internal testing or previous supplier data.
Evaluate COA completeness across the properties most critical to your application: basis weight, tensile or burst strength, moisture content, and Cobb value for moisture absorption. A supplier who method-names all core properties scores well on this metric; a supplier providing generic “test reports” without method references scores poorly.
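If you receive COAs as structured data, this check is easy to automate. A hypothetical sketch that flags critical properties reported without a named method:

```python
# Hypothetical structured COA: property -> (reported value, named method or None)
coa = {
    "basis_weight_gsm": (150, "TAPPI T 410"),
    "burst_strength_kpa": (350, "ISO 2758"),
    "moisture_pct": (7.5, None),      # result reported without a method name
    "cobb_60_gsm": (30, "ISO 535"),
}

critical_properties = ["basis_weight_gsm", "burst_strength_kpa",
                       "moisture_pct", "cobb_60_gsm"]

missing = [p for p in critical_properties if coa.get(p, (None, None))[1] is None]
completeness = (1 - len(missing) / len(critical_properties)) * 100
print(f"Method-named completeness: {completeness:.0f}% (missing: {missing})")
# -> Method-named completeness: 75% (missing: ['moisture_pct'])
```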
Lab Credibility: ISO/IEC 17025 Accreditation (10%)
Not all labs are created equal. ISO/IEC 17025 is the international standard for the competence of testing and calibration laboratories. Accredited labs undergo regular audits to confirm they use calibrated equipment, employ trained personnel, follow documented procedures, and maintain traceability.
When evaluating a supplier’s COA, verify whether the issuing lab holds ISO/IEC 17025 accreditation for the specific tests performed. Many national accreditation bodies maintain searchable online databases. If the supplier uses an in-house lab that lacks accreditation, request evidence of third-party verification or ask them to commission tests from an accredited facility for critical shipments.
This isn’t about distrusting the supplier; it’s about establishing a chain of technical credibility that protects both parties if disputes arise.
Sampling Protocol and Lot Traceability Declared (5%)
A COA based on a single sample from one end of one roll tells you very little about the entire shipment. Robust sampling follows documented protocols: sampling multiple rolls across the lot, taking samples from both machine direction (MD) and cross direction (CD), and clearly identifying the lot number that the samples represent.
Suppliers who document their sampling approach—how many rolls sampled, where samples were taken, and how samples were prepared—demonstrate transparency. This becomes critical if your receiving inspection contradicts the supplier’s COA. With lot-level traceability, you can pinpoint whether the discrepancy reflects natural material variation, sampling differences, or a genuine quality issue.
Third-Party Verification Available Upon Request (5%)
Some buyers, particularly for high-value orders or new supplier relationships, prefer third-party verification: an independent inspection agency witnesses sampling and testing, or an accredited lab re-tests retained samples. Suppliers who can accommodate this request, even if it’s rarely exercised, signal confidence in their QA process.
This metric doesn’t require that every shipment undergo third-party verification. It simply evaluates whether the supplier’s systems can support it when needed.
The Scoring Rubric and Decision Thresholds

The Integration Score combines the two axes using the formula:
Total Score = (0.60 × Reliability Score) + (0.40 × Evidence Integrity Score)
Both sub-scores are normalized to a 0–100 scale before applying the weights. The resulting total score maps to three decision thresholds:
- Green (80–100): Award-ready if commercial terms are acceptable. These suppliers present low risk.
- Amber (60–79): Conditional approval. Consider tightening specification tolerances, requiring more frequent testing, or increasing document review rigor. This tier is useful for newer suppliers building a track record.
- Red (<60): Do not award. The risk profile is too high. Either work with the supplier to improve their metrics or remove them from consideration.
These thresholds aren’t arbitrary. They reflect a practical reality: buyers operating with lean teams can’t afford to micromanage every supplier. Green-tier suppliers earn autonomy because their track record justifies it. Amber-tier suppliers require closer oversight. Red-tier suppliers consume more resources than they’re worth.
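In code, the whole rubric reduces to a few lines. This sketch applies the 60/40 weights and threshold bands exactly as defined above:

```python
def integration_score(reliability: float, evidence_integrity: float) -> float:
    """Combine the two 0-100 sub-scores using the 60/40 weights."""
    return 0.60 * reliability + 0.40 * evidence_integrity

def decision_tier(total: float) -> str:
    """Map a total score to the Green/Amber/Red decision thresholds."""
    if total >= 80:
        return "Green"
    if total >= 60:
        return "Amber"
    return "Red"

total = integration_score(reliability=85, evidence_integrity=70)
print(total, decision_tier(total))  # 79.0 Amber
```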
The Integration Score Matrix: A One-Page Decision Tool

The practical output of this model is a single-page matrix that travels with your RFQ and approval documentation. The matrix should include these columns:
- Metric: The specific measure being evaluated (e.g., “On-Time Shipment Rate”)
- Definition: A brief, unambiguous description of how the metric is calculated
- Data Source: Where the data comes from (supplier self-report, verified references, third-party audit)
- Unit: The measurement unit (%, count per 100 shipments, years)
- 12-Month Value: The supplier’s actual performance figure
- Normalized Score (0–100): The raw value converted to a common scale
- Weight: The metric’s importance in the total score (e.g., 0.20 for On-Time Shipment Rate)
- Weighted Score: Normalized Score × Weight
Attach a brief legend explaining how normalization works and how to interpret missing data. For example, if a supplier can’t provide a 12-month on-time shipment rate because they’re new to the lane, you might provisionally assign a score of 50 (neutral) and require them to submit updated data after six months.
Include a visual indicator—Green, Amber, or Red—based on the total score. This makes it easy for non-technical stakeholders to quickly assess supplier risk.
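One way to represent matrix rows programmatically, including the neutral score of 50 for missing data described in the legend above (field names are illustrative):

```python
from dataclasses import dataclass
from typing import Optional

NEUTRAL_SCORE = 50.0  # provisional score for missing data, per the legend

@dataclass
class MatrixRow:
    metric: str
    unit: str
    value_12m: Optional[float]  # None when the supplier has no lane history
    normalized: float           # 0-100
    weight: float               # e.g. 0.20 for On-Time Shipment Rate

    @property
    def weighted_score(self) -> float:
        return self.normalized * self.weight

rows = [
    MatrixRow("On-Time Shipment Rate", "%", 95.0, 90.0, 0.20),
    MatrixRow("Documentation Accuracy", "%", None, NEUTRAL_SCORE, 0.15),
]
print(sum(r.weighted_score for r in rows))  # 25.5 (= 90*0.20 + 50*0.15)
```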
You can explore similar supplier evaluation frameworks and connect with verified kraft paper suppliers through the PaperIndex supplier directory.
How to Operationalize: From RFQ to Approval
Implementing the Integration Score requires a shift in how you structure RFQs and supplier reviews.
Before Issuing the RFQ
Define what evidence you’ll require from suppliers. This typically includes a completed COA with method-named values, proof of lab accreditation (or willingness to use an accredited lab), a documented sampling protocol, and lot traceability procedures. Also specify the operational data you’ll need: on-time delivery statistics for the trailing 12 months, a summary of any quality disputes, and trade references for the relevant lane.
Some of this data won’t exist for suppliers who haven’t worked with you before. That’s expected. What matters is their willingness to provide it going forward and their transparency about current capabilities.
At RFQ Stage
Require suppliers to submit the filled Integration Score Matrix alongside their commercial quote. This means they must self-assess on each metric and provide supporting evidence. Suppliers who balk at this requirement are often the same ones who would later create operational headaches. Requiring the matrix up front filters for suppliers who take documentation seriously.
During Cross-Team Review
Once you’ve received quotes and matrices, convene a brief review session with representatives from procurement, QA, and operations. Each function brings a different perspective:
- Procurement focuses on cost-to-door and payment terms but uses the Integration Score to assess risk.
- QA scrutinizes the Evidence Integrity metrics, particularly method alignment and lab credibility.
- Operations evaluates Exporter Reliability, especially on-time delivery and documentation accuracy, because these directly impact production scheduling.
Use the score thresholds to drive consensus. A supplier in the Green tier with competitive pricing is usually an easy approval. An Amber-tier supplier might be worth the risk if they offer significantly better terms and you can mitigate risk through more frequent inspections. A Red-tier supplier requires a compelling compensating factor—perhaps they’re the only source for a niche grade—to justify the risk, and even then, you’d structure the contract with strict performance gates.
After Award
Archive the Integration Score Matrix and all supporting documentation with the RFQ file. This creates an audit trail if disputes arise and provides a baseline for future evaluations. Every quarter, update the supplier’s reliability metrics based on actual performance. If a Green-tier supplier’s on-time rate drops below 85%, move them to Amber and adjust oversight accordingly. If an Amber-tier supplier consistently exceeds expectations, promote them to Green and reduce inspection frequency.
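The quarterly re-tiering logic can be captured in a small rule function. The 85% on-time demotion trigger comes from the text above; the promotion condition shown is an illustrative assumption:

```python
def quarterly_tier_update(current_tier: str, on_time_pct: float,
                          total_score: float) -> str:
    """Re-tier a supplier after the quarterly metric refresh (a sketch)."""
    if current_tier == "Green" and on_time_pct < 85.0:
        return "Amber"  # demote and tighten oversight (per the rule above)
    if current_tier == "Amber" and total_score >= 80.0:
        return "Green"  # promote and relax inspections (assumed condition)
    return current_tier

print(quarterly_tier_update("Green", on_time_pct=82.0, total_score=78.0))  # Amber
```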
This continuous feedback loop transforms the Integration Score from a one-time evaluation tool into an ongoing supplier performance management system.
For buyers looking to implement this framework, the PaperIndex Academy offers additional resources on specification-first sourcing. If you’re ready to connect with suppliers who can provide method-named evidence, submit your RFQ with your evidence requirements clearly stated. Suppliers can join PaperIndex to access global buyer networks and demonstrate their quality capabilities.
Frequently Asked Questions
What if a supplier won’t provide method names on their certificates?
This is a red flag. Method names are standard practice for any supplier working with quality-conscious buyers. A supplier’s refusal or inability to provide them suggests one of three problems: their lab doesn’t follow recognized standards, they’re unfamiliar with export market requirements, or they’re intentionally obscuring their testing process. In any case, it signals risk. You can attempt to educate the supplier on your requirements—provide them with examples of proper method-named COAs—but if they remain unwilling, move on. The global kraft paper market has enough capacity that you don’t need to accept non-standard documentation practices.
How do we verify a lab’s ISO/IEC 17025 accreditation status?
Most countries have a national accreditation body that maintains an online registry of accredited laboratories. For example, in the United States, the ANSI National Accreditation Board (ANAB) provides a searchable database. In Europe, member bodies of the European co-operation for Accreditation (EA) maintain similar registries. When a supplier provides a COA, note the lab name and accreditation number (usually printed on the certificate), then search the relevant registry to confirm the lab’s accreditation is current and covers the specific test methods performed. If the supplier uses a lab in a country you’re unfamiliar with, ask them to provide the lab’s accreditation certificate or a direct link to the accreditation body’s verification page.
Can we adjust the weights for different risk profiles?
Absolutely. The 60/40 split between Reliability and Evidence Integrity is a starting point based on typical procurement priorities. If you’re sourcing a grade with extremely tight specifications—say, food-contact kraft paper where contamination risk is high—you might increase Evidence Integrity to 50% and reduce Reliability to 50%. Conversely, if you’re buying a commodity grade where minor spec variations are easily absorbed, you might weight Reliability at 70% because delivery predictability matters more than testing precision. The key is to document your weighting rationale and apply it consistently across all suppliers in that category. Don’t change weights mid-evaluation to favor a preferred supplier.
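If you maintain multiple weighting profiles, a small validation step keeps them honest. The profile names here are illustrative; the weights mirror the examples above:

```python
# Illustrative weighting profiles; document your rationale for each.
WEIGHT_PROFILES = {
    "default":      {"reliability": 0.60, "evidence": 0.40},
    "food_contact": {"reliability": 0.50, "evidence": 0.50},
    "commodity":    {"reliability": 0.70, "evidence": 0.30},
}

for name, w in WEIGHT_PROFILES.items():
    assert abs(sum(w.values()) - 1.0) < 1e-9, f"{name}: weights must sum to 1"

def profile_score(profile: str, reliability: float, evidence: float) -> float:
    w = WEIGHT_PROFILES[profile]
    return w["reliability"] * reliability + w["evidence"] * evidence
```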
How do Incoterms influence reliability metrics?
Incoterms directly affect two reliability metrics: Documentation Accuracy and Incoterms Adherence. A supplier quoting EXW (Ex Works) transfers responsibility for all transportation and documentation to you at their factory gate, so they have minimal documentation obligations. A supplier quoting DDP (Delivered Duty Paid) assumes responsibility all the way to your door, including import clearance, which dramatically increases their documentation burden. When comparing suppliers on different Incoterms, you must normalize the Documentation Accuracy metric to account for the scope of their obligations. An EXW supplier who correctly issues a packing list and commercial invoice might score 100% on documentation accuracy because that’s all they’re responsible for. A DDP supplier must handle import documentation, customs declarations, and delivery proof—a much higher bar. Consider this when setting expectations and weights.
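A sketch of scope-aware scoring: each Incoterm maps to the documents the supplier actually owes, and accuracy is computed only over that set. The document lists are illustrative, not exhaustive; tailor them to your lanes:

```python
# Illustrative (not exhaustive) document scope per Incoterm.
REQUIRED_DOCS = {
    "EXW": {"commercial_invoice", "packing_list"},
    "CIF": {"commercial_invoice", "packing_list", "bill_of_lading",
            "insurance_certificate", "vgm"},
    "DDP": {"commercial_invoice", "packing_list", "bill_of_lading", "vgm",
            "import_declaration", "delivery_proof"},
}

def doc_accuracy_pct(incoterm: str, correct_first_pass: set) -> float:
    """Score documentation accuracy only over documents the supplier owes."""
    required = REQUIRED_DOCS[incoterm]
    return len(required & correct_first_pass) / len(required) * 100

# An EXW supplier with both required documents correct scores 100%.
print(doc_accuracy_pct("EXW", {"commercial_invoice", "packing_list"}))
```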
Disclaimer: This article provides general information for B2B sourcing teams. It is not legal advice and does not replace your organization’s quality or compliance procedures.
Our Editorial Process
Our expert team uses AI tools to help organize and structure our initial drafts. Every piece is then extensively rewritten, fact-checked, and enriched with first-hand insights and experiences by expert humans on our Insights Team to ensure accuracy and clarity.
About the PaperIndex Insights Team
The PaperIndex Insights Team is our dedicated engine for synthesizing complex topics into clear, helpful guides. While our content is thoroughly reviewed for clarity and accuracy, it is for informational purposes and should not replace professional advice.
