Deep Visual Proteomics maps proteotoxicity in a genetic liver disease – Nature


Clinical cohorts and sample preparation

Patient biopsies and explant samples were obtained at two different sites, Odense University Hospital (OUH) and Aachen RWTH University Hospital (UKA). The sample origin is indicated in Supplementary Table 1. Following ethical guidelines, the clinical data provided here are deidentified by reporting only sample type, fibrosis score and site of origin.

OUH patient recruitment

Patients were recruited through the Danish patient organization (Alfa-1 Denmark) and clinical departments for liver and lung diseases as part of a cohort study. The cohort was designed to investigate liver health among nonpregnant adults (minimum age 18 years) diagnosed with AATD of any genotype and carrier status. This specific study includes 16 people diagnosed with Pi*ZZ who consented to undergo the procedure. The study was approved by the Danish Ethical Committee (S-20160187), and participants gave informed consent before enrolment. Participants without a history of liver transplant or decompensated cirrhosis were offered a percutaneous liver biopsy. Liver core needle biopsies were taken at OUH between 2017 and 2021, stored in 4% formalin and embedded in paraffin. For the assessment of fibrosis stage, FFPE blocks were cut on a microtome into 3-μm-thick sections and mounted on FLEX IHC slides (Dako). Tissue sections were deparaffinized with xylene, rehydrated in serial dilutions of ethanol and stained with Sirius Red. A certified hepatopathologist (S.D.) assessed the Kleiner fibrosis stage (0–4) according to the Pathology Committee of the NASH Clinical Research Network (NAS-CRN).

UKA patient recruitment

The recruitment of patients is described in detail in ref. 41. Of this cohort, the present study includes 19 people diagnosed with Pi*ZZ, of whom 14 underwent liver core needle biopsies owing to medical indication and five received a liver transplant because of end-stage liver disease. One patient’s sample was later removed owing to its outlier position on the proteome PCA (Supplementary Table 1). Samples were stored in 4% formalin and embedded in paraffin. Fibrosis stage was assessed after trichrome staining of 5-μm-thick sections by a certified hepatopathologist. Blocks were stored at room temperature. Ethical approval was provided by the institutional review board of Aachen University (EK 173/15). All participants provided written informed consent and were treated following the ethical guidelines of the Helsinki Declaration (Hong Kong Amendment) as well as Good Clinical Practice (European guidelines).

Staining

Polyethylene naphthalate membrane slides (2 μm; MicroDissect GmbH) were exposed to ultraviolet light (254 nm) for 1 h and then coated with Vectabond (Vector Laboratories; catalogue no. SP-1800-7) according to the manufacturer’s protocol. FFPE sections (3-μm-thick, DVP, ML; 10-μm-thick, scDVP) were mounted onto these slides and dried at 37 °C overnight. Slides were stored at 4 °C until further processing, upon which slides were baked at 55 °C for 40 min and then deparaffinized and rehydrated (xylene 2 × 2 min, 100% ethanol 2 × 1 min, 90% ethanol 2 × 1 min, 75% ethanol 2 × 1 min, 30% ethanol 2 × 1 min, ddH2O 2 × 1 min). Slides were transferred to prewarmed glycerol-supplemented antigen retrieval buffer (DAKO pH 9 S2367 + 10% glycerol) at 88 °C for 20 min, followed by a 20-min cooldown at room temperature (22 °C). After washing in water, sections were blocked with 5% bovine serum albumin (BSA) in PBS for 1 h, followed by an overnight incubation with primary antibodies in 1% BSA/PBS at 4 °C in a humid staining chamber (1:200 mouse IgG1 monoclonal AAT 2C1, Hycult catalogue no. HM2289; 1:200 rabbit recombinant anti-pan-Cadh (EPR1792Y), Abcam catalogue no. ab51034). After three washes in PBS for 2 min each, secondary antibodies (1:400 goat anti-mouse IgG1, Invitrogen catalogue no. A21127; 1:400 goat anti-rabbit AF647, Invitrogen catalogue no. A21245) in 1% BSA/PBS were applied for 90 min, followed by two 2-min washes in PBS, 15 min in SYTOX Green (1:40,000 in PBS, Invitrogen catalogue no. S7020), and three final 2-min washes in PBS. Excess liquid was removed and samples were coverslipped using SlowFade Diamond Antifade Mountant (Invitrogen, catalogue no. S36963).

Imaging

Widefield imaging

For DVP and scDVP experiments (Figs. 13), sections were imaged using a Zeiss Axioscan 7. For all excitation wavelengths (493 nm, 577 nm and 653 nm), 50% light source intensity was used. The illumination time was specified on one section and applied to all consecutive samples within one experimental group. Three z-stacks at an interval of 2 μm were recorded with a Plan-Apochromat ×20, 0.8 numerical aperture M27 objective and an Axiocam 712 camera at 14-bit, with a binning of 1 and a tile overlap of 10%, resulting in a scaling of 0.173 μm × 0.173 μm. Multiscene images were then split into single scenes, z-stacks combined into a single plane using extended depth of focus (variance method, standard settings) and stitched on the pan-Cadh channel using the proprietary Zeiss Zen Imaging software.

Confocal imaging

For experiments with downstream machine learning applications (Fig. 4), sections were imaged on a Perkin Elmer OperaPhenix high-content microscope, controlled with Harmony v.4.9 software, at ×40 magnification and 0.75 numerical aperture, with a binning of 1 and a per tile overlap of 10%. Only one z-plane was recorded, which was specified manually for each slide and channel. The three channels were imaged consecutively after deactivation of simultaneous recording to avoid any leakage between channels.

Cell selection with Biological Image Analysis Software

Images were imported as .czi files into the Biological Image Analysis Software (BIAS) using the packaged import tool (ref. 4). Within BIAS, images were then retiled to 1,024 × 1,024 pixels with an overlap of 10%, and empty tiles were excluded from further analyses. Outlines of all cells per biopsy were identified in an unbiased way by using Cellpose v.2.0 with the default cyto2 model based on anti-pan-Cadh stains (ref. 42). Masks were imported into BIAS, and duplicates, as well as cells touching the borders of a tile (0.1% on each side), were removed. Further filtering was applied to retain cells with a minimum size of 3,000 pixels, enriching for the hepatocyte population. For classification based on low, medium and high aggregate load, all cells per biopsy or explant tissue were divided into five classes using a multilayer perceptron with the following parameters: weight scale 0.01; momentum 0.01; maximum iterations 10,000; epsilon 0.0005 and five neurons in the hidden layer. Classification was based on the AAT maximum, median and mean intensity within the cell outline mask, involving no human intervention. The low class was attributed to the cells with the lowest normalized mean intensity, medium to the third highest and high to the highest normalized mean intensity; the other two intermediate classes were dropped. Reference points were selected on the basis of prominent nuclear and histological features; 100 cells were picked randomly for excision.
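
The size filter and random picking described above can be sketched in a few lines of Python. This is an illustration only, not the BIAS workflow: the dictionary keys (`id`, `area_px`, `cls`), the class labels and the reading that 100 cells are drawn per retained class are assumptions made for the example.

```python
import random

MIN_AREA_PX = 3000     # minimum cell size stated in the text
CELLS_PER_CLASS = 100  # assumption: 100 cells are drawn per retained class


def select_cells(cells, kept_classes=("low", "medium", "high"), seed=0):
    """Pick random cells per retained class after the minimum-size filter.

    cells: list of dicts with hypothetical keys 'id', 'area_px' and 'cls'.
    Returns {class label: sorted list of selected cell ids}.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    selected = {}
    for cls in kept_classes:
        pool = [c["id"] for c in cells
                if c["area_px"] >= MIN_AREA_PX and c["cls"] == cls]
        k = min(CELLS_PER_CLASS, len(pool))  # never ask for more than exist
        selected[cls] = sorted(rng.sample(pool, k))
    return selected
```

Cells in the two dropped intermediate classes are simply never requested, so they do not appear in the result.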

For single shape experiments, six characteristic low-fibrosis samples (all F1) were chosen, and regions were selected that presented either a clear border-like phenotype (that is, a row of AAT+ cells in direct neighbourhood to AAT− cells) or single AAT+ cells surrounded by AAT− cells. The cells were selected manually in BIAS, starting from the innermost cell and moving spiral-like to the outermost cell, thus avoiding cross-contamination of consecutively cut material.

Single-cell image generation

Images were flat-field corrected during image acquisition using the Perkin Elmer Harmony software (v.4.9). Stitching of the flat-field corrected image tiles was performed using the Python library scPortrait (https://github.com/MannLabs/scPortrait). The stitched tile positions were calculated using the anti-pan-Cadh stains imaged in the Alexa647 channel as a reference and then transferred to the other image channels. During stitching, the tile overlap was set to 0.1, the filter sigma parameter to 1 and the max shift parameter to 50.

The stitched images were then further processed in scPortrait. Cell outlines were identified on the basis of the sevenfold-downsampled anti-pan-Cadh stains using Cellpose v.2.0 with the pretrained ‘cyto’ model (ref. 42). Segmentation was performed in a tiled mode with a 100-pixel overlap. After resolving the cell outlines from overlapping regions, the resulting segmentation mask was upscaled to the original input dimensions, during which the edges of the masks were smoothed by applying an erosion and dilation operation with a kernel size of 7.

Then, the generated segmentation mask was used to extract single-cell image datasets with a size of 280 pixels × 280 pixels. During extraction, the same single-cell image masks are used to obtain the pixel information from each channel for each cell. The resulting single-cell images were then rescaled to the [0, 1] range while preserving relative signal intensities. The resulting single-cell image datasets were filtered to contain only cells from within manually annotated regions in the tissue section containing hepatocytes but not fibrotic tissue.

Cell selection with the convolutional neural network

The filtered single-cell image datasets produced by scPortrait were further filtered to remove any cells that fell outside the 5–97.5% size percentile. Representations of the remaining cells were generated by featurization using the natural image-pretrained ConvNext model (ref. 31). For this, the single-cell images depicting the Alpha-1 channel were rescaled to the expected image dimensions of N pixels × N pixels and triplicated to generate a pseudo-RGB image. Inference was then performed using the Hugging Face transformers package v.4.26 (ref. 43).
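
As a rough illustration of the two preprocessing steps named above, the following sketch filters cells by size percentile and triplicates a single-channel image into three identical channels. It uses plain Python lists and a linear-interpolation percentile; both choices, and all function names, are assumptions for the example and are not taken from scPortrait.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def filter_by_size(areas, lower_q=5.0, upper_q=97.5):
    """Return indices of cells whose area lies within the percentile band."""
    lo, hi = percentile(areas, lower_q), percentile(areas, upper_q)
    return [i for i, a in enumerate(areas) if lo <= a <= hi]


def to_pseudo_rgb(gray):
    """Copy a 2D image (list of rows) into three identical channels."""
    return [[row[:] for row in gray] for _ in range(3)]
```

The resizing to the model's expected input dimensions is left out because the target size is not stated here.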

The resulting 2,048 image features were projected into a two-dimensional space using the UMAP algorithm (ref. 44). The UMAP dimensions were calculated on the basis of the first 50 principal components and the 15 nearest neighbours. Using the spectral clustering algorithm from scikit-learn (ref. 45), the resulting UMAP space was split into 50 clusters. The geometric centre of each cluster was calculated and the 50 cells with the smallest Euclidean distance to the cluster centre were selected for laser microdissection.
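
The last step, picking the cells closest to a cluster's geometric centre, can be sketched as follows for a single cluster. The input is assumed to be a list of two-dimensional UMAP coordinates for the cells of that cluster; the function name is hypothetical.

```python
import math


def nearest_to_centre(points, n=50):
    """Return indices of the n points closest to the cluster's geometric centre.

    points: list of (x, y) embedding coordinates belonging to one cluster.
    """
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    order = sorted(
        range(len(points)),
        key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy),
    )
    return order[:n]
```

Applied once per cluster, this yields at most 50 representative cells for each of the 50 clusters.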

Contour outlines of the selected cells were generated in scPortrait using the py-lmd package (ref. 46), whereby the cell outlines were dilated with a kernel size of 3 and a smoothing filter of 25 was applied. Furthermore, the number of points defining each shape was compressed by a factor of 30 to improve laser microdissection cutting performance. The cutting path, that is, the order in which cells are cut, was optimized using the Hilbert algorithm (https://github.com/galtay/hilbertcurve).
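
To show why a Hilbert curve gives a sensible cutting order, the sketch below sorts cell centres by their position along the curve, so that consecutive cuts tend to be spatially close. It is a self-contained implementation of the standard coordinate-to-index mapping and does not use the hilbertcurve package cited above; the grid resolution (`order`) and the function names are assumptions.

```python
def hilbert_index(n, x, y):
    """Index of grid cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the sub-curve lines up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def order_cells(centres, order=8):
    """Return cell indices sorted along a Hilbert curve through their centres.

    centres: list of (x, y) positions; order: curve order (grid is 2**order wide).
    """
    n = 1 << order
    min_x = min(c[0] for c in centres)
    min_y = min(c[1] for c in centres)
    span = max(max(c[0] for c in centres) - min_x,
               max(c[1] for c in centres) - min_y) or 1.0

    def key(i):
        gx = int((centres[i][0] - min_x) / span * (n - 1))
        gy = int((centres[i][1] - min_y) / span * (n - 1))
        return hilbert_index(n, gx, gy)

    return sorted(range(len(centres)), key=key)
```

Compared with a row-by-row sweep, this ordering avoids long jumps of the stage between distant cells.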

Laser microdissection

After aligning the reference points, contour outlines were imported, and shapes were cut using the LMD7 (Leica) laser microdissection system in a semi-automated mode with the following settings: power 45; aperture 1; speed 40; middle pulse count 1; final pulse 0; head current 42–50%; pulse frequency 2,982 and offset 190. The microscope was operated with the LMD beta v.10 software, calibrated for the gravitational stage shift into 384-well plates (Eppendorf, catalogue no. 0030129547), leaving the outermost rows and columns empty. To prevent sorting errors, a ‘wind shield’ plate was placed on top of the sample stage. Plates were then sealed, centrifuged at 1,000g for 5 min, and subsequently frozen at −20 °C for further processing.
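
The plate layout implied above, a 384-well plate with the outermost rows and columns left empty, can be enumerated as follows. The well naming (rows A to P, columns 1 to 24) is the usual convention for 384-well plates and is assumed here; the function name is hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:16]  # rows A..P of a 384-well plate
COLS = list(range(1, 25))           # columns 1..24


def usable_wells():
    """List wells that remain when the outermost rows and columns stay empty."""
    return [f"{r}{c}" for r in ROWS[1:-1] for c in COLS[1:-1]]
```

This leaves 14 rows by 22 columns, that is, 308 collection wells per plate.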

Peptide preparation and Evotip loading

Peptides were prepared as described previously using a BRAVO pipetting robot (Agilent; ref. 47). Briefly, 384-well plates were thawed, and shapes (both combined and individual) were rinsed from the walls into the bottom of the well with 28 μl of 100% acetonitrile (ACN). The wells were dried completely in a SpeedVac at 45 °C, followed by the addition of 6 μl of 60 mM triethylammonium bicarbonate (Supelco, catalogue no. 18597) (pH 8.5) supplemented with 0.013% n-dodecyl-beta-d-maltoside (Sigma-Aldrich, catalogue no. D5172). Plates were sealed and incubated at 95 °C for 1 h. After adjusting to 10% ACN, samples were incubated again at 75 °C for 1 h. Subsequently, 6 ng and 4 ng of trypsin and Lys-C protease, respectively, in 1 μl of 60 mM triethylammonium bicarbonate buffer were added to each sample, and proteins were digested for 16 h at 37 °C. The reaction was quenched by adding trifluoroacetic acid to a final concentration of 1%. Peptide samples were then frozen at −20 °C.

For loading, new Evotips were first soaked in 1-propanol for 1 min, then rinsed twice with 50 μl of buffer B (ACN with 0.1% formic acid). After another 1-propanol soaking step for 3 min, the tips were equilibrated with two washes of 50 μl buffer A (0.1% formic acid). Samples were loaded into 70 μl of preloaded buffer A. Following one additional buffer A wash, the peptide-containing C18 disk was overlaid with 150 μl buffer A and centrifuged briefly through the disk. All centrifugation steps were performed at 700g for 1 min. The final tips were stored in buffer A for a maximum of 4 days before liquid chromatography (LC)-MS.

LC-MS data acquisition

The peptide samples were analysed using an Evosep One LC system (Evosep) coupled to an Orbitrap Astral mass spectrometer (Thermo Fisher Scientific). Peptides were eluted from the Evotips with up to 35% ACN and separated using an Evosep low-flow ‘Whisper’ gradient for DVP samples, or an experimental Evosep ‘Whisper Zoom’ gradient for single shapes and DVP-machine learning samples, with a throughput of 40 samples per day on an Aurora Elite TS column of 15-cm length, 75-μm-internal diameter, packed with 1.7 μm C18 beads (IonOpticks). The column temperature was maintained at 50 °C using a column heater (IonOpticks).

The Orbitrap Astral mass spectrometer was equipped with a FAIMS Pro interface and an EASY-Spray source (both Thermo Fisher Scientific). A FAIMS compensation voltage of −40 V and a total carrier gas flow of 3.5 l min−1 were used. An electrospray voltage of 1,900 V was applied for ionization, and the radio frequency level was set to 40. Orbitrap MS1 spectra were acquired from 380 to 980 m/z at a resolution of 240,000 (at m/z 200) with a normalized automated gain control (AGC) target of 500% and a maximum injection time of 100 ms.

For the Astral MS/MS scans in data-independent acquisition (DIA) mode, we determined the optimal methods experimentally across the precursor selection range of 380–980 m/z: (1) for DVP samples, a window width of 5 Th, a maximum injection time of 10 ms and a normalized AGC target of 800% were used. (2) For DVP-machine learning samples, a window width of 6 Th, a maximum injection time of 13 ms and a normalized AGC target of 500% were applied. (3) For single shapes and other DIA scans, the window width was optimized on the basis of precursor density across the selection range of 380–980 m/z. A total of 45 variable-width DIA windows (Supplementary Table 3) were acquired with a maximum injection time of 28 ms and an AGC target of 800%. The isolated ions were fragmented using higher-energy collisional dissociation with 25% normalized collision energy. Detailed method descriptions are provided in a default format with each supplementary data table.
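
For orientation, the fixed-width schemes can be turned into window counts with a short calculation. The sketch assumes contiguous, non-overlapping windows spanning 380–980 m/z, which the text does not state explicitly, and it treats the product of window count and maximum injection time only as an upper bound on ion accumulation per cycle, not as the measured cycle time.

```python
def fixed_width_windows(start_mz=380.0, stop_mz=980.0, width_th=5.0):
    """Return contiguous, non-overlapping (lower, upper) isolation windows."""
    n = round((stop_mz - start_mz) / width_th)
    return [(start_mz + i * width_th, start_mz + (i + 1) * width_th)
            for i in range(n)]


def max_fill_time_s(n_windows, max_injection_ms):
    """Upper bound on summed MS/MS injection time per cycle, in seconds."""
    return n_windows * max_injection_ms / 1000.0


if __name__ == "__main__":
    methods = [
        ("DVP", len(fixed_width_windows(width_th=5.0)), 10),
        ("DVP-machine learning", len(fixed_width_windows(width_th=6.0)), 13),
        ("single shapes", 45, 28),  # 45 variable-width windows from the text
    ]
    for label, n_windows, inj_ms in methods:
        print(f"{label}: {n_windows} windows, <= {max_fill_time_s(n_windows, inj_ms):.2f} s")
```

Under these assumptions the three methods give 120, 100 and 45 windows, with summed maximum injection times of roughly 1.2 to 1.3 s each.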

Spectral searches and normalization

The raw files were searched together with match-between-runs in library-free mode within each experimental group with DIA-NN v.1.8.1 (ref. 48). A FASTA file containing only canonical sequences was obtained from Uniprot (20,404 entries, downloaded on 2 January 2023), and the disease-causing amino acid was changed manually (E342K). We allowed up to one missed cleavage, and set mass accuracy to 8, MS1 accuracy to 4 and the scan window to 6. Proteins were inferred on the basis of genes, and the neural network classifier was set to ‘single-pass mode’. For DVP and DVP-machine learning samples, precursor intensities in the ‘report.tsv’ file were then normalized using the directLFQ GUI at standard settings including a minimum number of non-NaN ion intensities required to derive a protein intensity of 1 (ref. 49). The single shape data were additionally median-normalized to a set of proteins quantified across all samples (451 proteins quantified in 100% of included samples; Supplementary Table 3), thereby correcting for the dependence of protein numbers on shape size (ref. 5).
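
One way to read the median normalization of the single shape data is sketched below: each sample is shifted so that the median of the shared reference proteins is the same in every sample. Working on log2 intensities, shifting to the median of the per-sample medians, and the data layout are assumptions of this example and are not specified in the text.

```python
from statistics import median


def median_normalize(log2_matrix, reference_proteins):
    """Align samples on the median of a shared reference protein set.

    log2_matrix: {sample: {protein: log2 intensity}}; every reference protein
    must be quantified in every sample.
    """
    sample_medians = {
        s: median(vals[p] for p in reference_proteins)
        for s, vals in log2_matrix.items()
    }
    target = median(sample_medians.values())  # common level after shifting
    return {
        s: {p: v - sample_medians[s] + target for p, v in vals.items()}
        for s, vals in log2_matrix.items()
    }
```

Because the shift is estimated only from proteins seen in every sample, it is not biased by the larger number of proteins detected in larger shapes.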

Data analysis and statistics

Data were analysed using R v.4.4.1. The directLFQ output file ‘pg_matrix.tsv’ was used for all subsequent data analysis, including the reported protein counts. Samples were included if the number of protein groups exceeded (1) the mean − 1.5 s.d. for DVP, resulting in 5.9% (6 of 102) dropouts; (2) the mean − 0.5 s.d. for DVP-machine learning samples; (3) a fitted logarithmic curve − 1.5 interquartile ranges for scDVP, taking the relation between size and proteomic depth into account, resulting in 15.4% (40 of 259) dropouts. The lower cutoffs were selected after manual inspection of the data distribution. Although some samples were collected in technical duplicates per patient biopsy, only the first replicate was used for statistical analyses and all reported measurements were taken from distinct samples. Coefficients of variation were calculated on nontransformed intensity values. For principal component analysis (PCA), the R package PCAtools v.2.16.0 was used on a complete data matrix, removing the lower 10% of variables based on variance. Statistical analyses were performed on proteins with at least 30% data completeness across samples, assuming normality using the limma package v.3.60.3 with two-sided moderated t-tests and ‘fdr’ as a multiple testing correction method. A per patient statistical pairing was applied for DVP and single shape experiments. Intensity and fold changes are reported as log2-transformed values unless indicated otherwise. GSEA was conducted using WebGestalt 2024 against the indicated databases, with an FDR of 50. Interaction networks were calculated with STRING database at standard settings (ref. 51). Plasma proteins were retrieved from the Human Protein Atlas resource section with the search term ‘sa_location:Secreted to blood AND tissue_category_rna:liver;Tissue enriched’ (ref. 52).
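
The sample inclusion rule for DVP samples (protein group count above the mean minus 1.5 s.d.) can be written out as a short sketch. The analysis itself was done in R; Python is used here only for illustration, and the use of the sample standard deviation and the function name are assumptions.

```python
from statistics import mean, stdev


def passing_samples(protein_counts, k=1.5):
    """Keep samples whose protein group count exceeds mean - k * s.d.

    protein_counts: {sample name: number of protein groups}.
    Returns (list of retained sample names, cutoff value).
    """
    counts = list(protein_counts.values())
    cutoff = mean(counts) - k * stdev(counts)  # sample s.d. (n - 1), an assumption
    kept = [s for s, n in protein_counts.items() if n > cutoff]
    return kept, cutoff
```

Setting `k=0.5` gives the stricter rule stated for the DVP-machine learning samples; the curve-based rule for scDVP is not covered by this sketch.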
The timing of responses was ranked by the absolute difference between B values of limma’s moderated t-test comparing three AAT load groups: low to moderate, and moderate to high. Only proteins with more than 70% data completeness and significance (FDR 2-transformed intensity values of individual proteins in samples with log2(AAT)-intensity P values are corrected for multiple testing using the ‘fdr’ method. Spatial data were mapped using the ‘simple features’ package. Binned expression presented in supplementary tables was constructed by grouping AAT or CRP expression into ten equidistant bins and taking the median expression of proteins across samples in each bin.
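
The binning described in the last sentence can be illustrated as follows: samples are assigned to ten equal-width bins along the AAT (or CRP) intensity axis, and the median of another protein's intensity is taken within each bin. The function name, the handling of empty bins (returned as `None`) and the placement of the maximum value in the last bin are assumptions of this sketch.

```python
from statistics import median


def binned_medians(marker, protein, n_bins=10):
    """Median protein intensity within equal-width bins of a marker intensity.

    marker, protein: equal-length lists with one value per sample.
    Returns a list of n_bins medians; empty bins are reported as None.
    """
    lo, hi = min(marker), max(marker)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant marker
    bins = [[] for _ in range(n_bins)]
    for m, p in zip(marker, protein):
        idx = min(int((m - lo) / width), n_bins - 1)  # max value goes in last bin
        bins[idx].append(p)
    return [median(b) if b else None for b in bins]
```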

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


