This project aims to integrate two categories of models into the mhub.ai platform:
1. CPU friendly (whole-body) auto-segmentation models
2. CT utility model for image QA
CPU-friendly auto-seg
Several auto-segmentation models have recently been integrated into 3D Slicer through https://github.com/lassoan/SlicerMONAIAuto3DSeg/releases/tag/ModelsTestResults.
While the quick versions of these models run fast on CPU, the slower versions take a couple of minutes. It would be interesting to explore whether CPU-related optimizations could increase the speed and reduce the memory footprint of the full-resolution versions while making the quick versions even faster.
Some initial thoughts on optimization techniques include:
Converting the models to the OpenVINO format for optimized inference on CPU (https://docs.openvino.ai/2024/home.html, https://docs.openvino.ai/2024/omz_demos_3d_segmentation_demo_python.html). This could provide faster inference and make the models more lightweight, offering a better user experience as well.
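As a minimal sketch of what the conversion step could look like (assuming a MONAI SegResNet with the same patch size as the benchmark below; the actual Auto3DSeg checkpoints would need their own export path), openvino.convert_model can trace the PyTorch module and save it as IR:

import torch
import openvino as ov
from monai.networks.nets import SegResNet

# Hypothetical example: a SegResNet with one output channel and a
# 64 x 128 x 128 input patch, matching the benchmark further below.
model = SegResNet(out_channels=1)
model.eval()
example_input = torch.randn(1, 1, 64, 128, 128)

# Trace the PyTorch module into an OpenVINO model and save it as IR
# (segresnet.xml / segresnet.bin), which the OpenVINO benchmark reads back.
ov_model = ov.convert_model(model, example_input=example_input)
ov.save_model(ov_model, "segresnet.xml")

By default ov.save_model also compresses weights to FP16, which should shrink the saved model and help with the lightweight aspect mentioned above.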
CT utility models
Implementing CT image inspection utility models, namely body part regression (https://github.com/MIC-DKFZ/BodyPartRegression). This model determines which body part is examined and whether certain slices of the processed image (NIfTI) contain anomalies.
Integrating this into MHub would allow users to perform this QA by providing DICOM inputs directly.
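As a rough illustration (assuming the bpreg pip package and its bpreg_predict command-line tool, which operates on NIfTI inputs; an MHub module would additionally handle the DICOM-to-NIfTI conversion before this step, and the paths below are placeholders):

import subprocess

# Hypothetical paths; inside an MHub module the incoming DICOM series would
# first be converted to NIfTI before body part regression is run.
input_dir = "/data/nifti"   # directory of .nii.gz CT volumes
output_dir = "/data/bpreg"  # per-volume JSON reports with slice scores and body-part ranges

# bpreg_predict is the CLI provided by `pip install bpreg`
# (assumption based on the BodyPartRegression repository's usage instructions).
subprocess.run(["bpreg_predict", "-i", input_dir, "-o", output_dir], check=True)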
# Baseline: PyTorch SegResNet inference benchmark on CPU
import time

import torch
from torch.cuda.amp import autocast
from monai.networks.nets import SegResNet

model = SegResNet(out_channels=1)
model.to("cpu")

# Initialize variables for benchmarking
total_time = 0
num_runs = 100

# Set the model to evaluation mode
model.eval()

with torch.no_grad():
    # CUDA autocast is effectively a no-op for this CPU-only run
    with autocast(enabled=True):
        for _ in range(num_runs):
            # Generate a random input patch
            torch_input = torch.randn(1, 1, 64, 128, 128)
            # Measure inference time
            start_time = time.time()
            model(torch_input)
            end_time = time.time()
            # Accumulate total inference time
            total_time += end_time - start_time

# Calculate and print the average inference time
avg_time = total_time / num_runs
print(f"Average execution time over {num_runs} runs: {avg_time:.5f} seconds")
Result:
Average execution time over 100 runs: 0.24193 seconds
# OpenVINO: benchmark the same SegResNet exported to IR (segresnet.xml) on CPU
import time

import torch
import openvino as ov

# Initialize OpenVINO core and read the converted model
core = ov.Core()
model = core.read_model("segresnet.xml")
compiled_model = core.compile_model(model, "CPU")

# Create an infer request
infer_request = compiled_model.create_infer_request()

# Number of iterations for benchmarking
num_iterations = 100
execution_times = []

# Random input tensors with the correct shape
input_shape = (1, 1, 64, 128, 128)

for i in range(num_iterations):
    # Create an OpenVINO tensor backed by the torch tensor's memory
    torch_input = torch.randn(*input_shape)
    input_tensor = ov.Tensor(array=torch_input.numpy(), shared_memory=True)

    # Measure inference time
    start_time = time.time()
    # Set the input tensor for a model with one input
    infer_request.set_input_tensor(input_tensor)
    infer_request.start_async()
    infer_request.wait()
    # Get the output tensor for a model with one output
    output = infer_request.get_output_tensor()
    output_buffer = output.data
    end_time = time.time()

    execution_times.append(end_time - start_time)

# Calculate and print the average execution time
avg_execution_time = sum(execution_times) / num_iterations
print(f"Average execution time over {num_iterations} iterations: {avg_execution_time:.5f} seconds")
Result:
Average execution time over 100 iterations: 0.19558 seconds
There seems to be a speed-up with OpenVINO (roughly 1.24x: 0.242 s vs. 0.196 s per run on average). More investigation is needed into the exact patch size that Auto3DSeg uses and into the sliding-window inference settings.
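As a starting point for that investigation, here is a minimal sketch (assuming MONAI's sliding_window_inference and the IR exported above; the roi_size and volume shape are placeholders, not the confirmed Auto3DSeg settings) of using the compiled OpenVINO model as the sliding-window predictor:

import torch
import openvino as ov
from monai.inferers import sliding_window_inference

core = ov.Core()
compiled_model = core.compile_model(core.read_model("segresnet.xml"), "CPU")

def ov_predictor(patches: torch.Tensor) -> torch.Tensor:
    # Run one window through the compiled OpenVINO model and hand the result
    # back to MONAI as a torch tensor so the windows can be stitched together.
    result = compiled_model(patches.contiguous().numpy())[compiled_model.output(0)]
    return torch.from_numpy(result)

# Placeholder volume and patch size; the real roi_size should match whatever
# Auto3DSeg / SlicerMONAIAuto3DSeg uses at inference time.
volume = torch.randn(1, 1, 128, 256, 256)
prediction = sliding_window_inference(
    inputs=volume,
    roi_size=(64, 128, 128),
    sw_batch_size=1,  # keep batch size 1 to match the statically shaped IR
    predictor=ov_predictor,
)
print(prediction.shape)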
BPREG: